Reward probability crucially determines the value of outcomes. A basic phenomenon, defying explanation by traditional decision theories, is that people often overweight small and underweight large probabilities in choices under uncertainty. However, the neuronal basis of such reward probability distortions and their position in the decision process are largely unknown. We assessed individual probability distortions with behavioral pleasantness ratings and brain imaging in the absence of choice. Dorsolateral frontal cortex regions showed experience-dependent overweighting of small, and underweighting of large, probabilities whereas ventral frontal regions showed the opposite pattern. These results demonstrate distorted neuronal coding of reward probabilities in the absence of choice, stress the importance of experience with probabilistic outcomes and contrast with linear probability coding in the striatum. Input of the distorted probability estimations to decision-making mechanisms is likely to contribute to well-known inconsistencies in preferences formalized in theories of behavioral economics.
The probability of reward crucially determines the value of choice options. To make optimal choices, traditional economic theories suggest we should process probability linearly (Von Neumann and Morgenstern, 1944; Pascal, 1948; Bernoulli, 1954). However, the prevalence of behaviors such as gambling and buying insurance suggests that we may not always process probability in a linear manner. Instead, in these examples, we tend to give more weight than objectively warranted to small probabilities (of winning or an adverse event occurring) and less weight to large probabilities (of losing or no adverse event occurring). Such subjective deviations from objective probabilities are called “probability distortions,” form a pervasive feature of human decision making, and build a cornerstone of modern economic theories, such as prospect theory (Allais, 1953; Kahneman and Tversky, 1979; Hershey and Schoemaker, 1980; Lattimore et al., 1992; Tversky and Kahneman, 1992; Camerer and Ho, 1994; Tversky and Wakker, 1995; Wu and Gonzalez, 1996; Wakker et al., 1997; Gonzalez and Wu, 1999; Abdellaoui, 2000; List and Haigh, 2005; Bleichrodt and Eeckhout, 2006).
Probability distortions come in two patterns. Overweighting of small (p < 0.3–0.4) and underweighting of large (p > 0.4) probabilities reflect an inverted S-shaped pattern of distortion (Kahneman and Tversky, 1979; Tversky and Kahneman, 1992; Prelec, 1998). Inverted S-patterned distortions occur primarily with verbal descriptions of probabilities not involving actual experience of outcomes. The second pattern, underweighting of small and overweighting of large probabilities, corresponds to regular S-shaped distortions, is relatively prevalent, and becomes even more prevalent with increasing experience of probabilistic outcomes (Kahneman and Tversky, 1979; Gigliotti and Sopher, 1993; Harbaugh et al., 2002; Kareev et al., 2002; Barron and Erev, 2003; Hertwig et al., 2004; Weber et al., 2004; Weber, 2006).
A fundamental issue for understanding behavior involving probabilistic outcomes concerns the stage of the decision process at which probability distortions occur. An obvious possibility is that distortions arise at the actual choice stage, when people decide between options with different reward probabilities (Kahneman and Tversky, 1979). In this view, people estimate and represent stated probabilities accurately but distortions become manifest at the choice stage when the decision maker assigns decision weights to different options. An alternative, and not yet tested, possibility would be that individuals already perceive probabilities in a nonlinear manner before any decision process is engaged. As a consequence, probability distortion might occur at an earlier period, before decisions are actually made. Here we tested this possibility with a simple stimulus-reward association task not requiring overt choice.
A substantial body of evidence points to a role of lateral prefrontal cortex in processing probabilistic outcomes. This region is active during probabilistic decision-making (Heekeren et al., 2004, 2006). Lateral prefrontal neurons signal reward probabilities (Kobayashi et al., 2002) and process reward and action in stochastic situations (Barraclough et al., 2004). Ventral prefrontal regions process information that reflects the subjective experience of events including rewarding outcomes (Fuster, 1997; Duncan and Owen, 2000; Rolls, 2004). Given this, and given that regular S distortions may be experience-induced (Hertwig et al., 2004), we hypothesized that activation in ventral regions would show regular S distortions. Conversely, dorsal prefrontal regions are preferentially involved in cognitive and evaluative functions (Fuster, 1997; Duncan and Owen, 2000; Rolls, 2004). Inverted S distortions may represent a cognitive response to abstractly presented, rather than experienced, probabilities. We therefore hypothesized that inverted S distortions would occur in dorsal prefrontal regions. Given that different economic theories employ linear or distorted probability terms (Von Neumann and Morgenstern, 1944; Kahneman and Tversky, 1979), it is important to know whether all neuronal probability signals are distorted or whether some are linear. Previous research suggests linear reward probability coding in the striatum (Abler et al., 2006; Preuschoff et al., 2006). Conversely, individual differences in risk processing seem to be expressed more in prefrontal cortex than striatum (Tobler et al., 2007). We therefore hypothesized a neuronal dissociation between veridical and distorted probability processing in the striatum and prefrontal cortex, respectively.
Materials and Methods
Relation to prospect theory.
Prospect theory assumes nonlinear distortions both of probabilities and values. In contrast, traditional concepts assume linear probability and linear or distorted value terms (e.g., expected utility as sum of probability weighted nonlinear, subjective, values; expected value as sum of probability weighted linear, objective, values). In the present context, we focus on distortion of probabilities rather than values. Although we presently assume linear value coding and use only two magnitude levels, this does not exclude nonlinear value representation in prefrontal cortex or striatum. Also, the present study should not be taken as an attempt to identify a neuronal correlate of prospect. Rather, we focus on the potential requirement of overt choice for the occurrence of probability distortions.
The individual participants, the basic design of the experiments and the imaging techniques for recording the hemodynamic response of reward regions were identical to those previously reported (Tobler et al., 2007). Sixteen right-handed healthy participants (mean age, 27 years; range, 20–41 years; eight females) were investigated. Participants were preassessed to exclude previous histories of neurological or psychiatric illness. All participants gave informed consent, and the study was approved by the Joint Ethics Committee of the National Hospital for Neurology and Neurosurgery (United Kingdom).
To ensure that participants were aware of the stated probabilities, we showed and explained the stimulus-probability assignment before the start of the experiment and confirmed their understanding with basic questions. Participants were then placed on a moveable bed in the scanner with light head restraint to limit head movement during image acquisition. Participants viewed a computer monitor through a mirror fitted on top of the head coil. To study the processing of economic parameters regardless of choice, subjects performed in a simple conditioning paradigm in the scanner. At the beginning of a trial in the main scanning paradigm, single visual stimuli appeared for 1.5 s in one of the four quadrants of the monitor. Outcomes appeared 1 s after the stimulus for 0.5 s below the stimulus on the monitor such that outcome and stimulus presentation coterminated. Intertrial intervals varied between 1 and 8 s according to a truncated Poisson distribution with a mean of 3 s. To summarize, in each trial inside the scanner, participants viewed a stimulus, indicated with a button press where the stimulus had appeared, and viewed an outcome (Fig. 1 a). In a separate behavioral rating task performed outside the scanner before and after the experiment, stimuli were presented randomly and participants rated each stimulus on a scale from 5 (very pleasant) to −5 (very unpleasant). For the rating task, outcomes were not shown.
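The intertrial-interval scheme described above can be illustrated with a short sketch. This is not the Cogent/Matlab code used in the experiment: it assumes whole-second intervals, a Poisson rate parameter of 3, and truncation to the 1 to 8 s range by rejection sampling. The names `poisson_draw` and `sample_iti` are hypothetical.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sample_iti(rng, lam=3.0, low=1, high=8):
    """Redraw until the Poisson variate falls inside [low, high] seconds."""
    while True:
        k = poisson_draw(lam, rng)
        if low <= k <= high:
            return k

rng = random.Random(0)  # fixed seed so the sketch is reproducible
itis = [sample_iti(rng) for _ in range(1000)]
mean_iti = sum(itis) / len(itis)
```

Truncation shifts the realized mean slightly away from the rate parameter, so a design that requires a mean of exactly 3 s would need the rate adjusted accordingly.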
In each trial inside the scanner, we randomly presented one of 12 visual stimuli, each predicting reward with a specific magnitude and probability. Ten of these 12 stimuli were of interest for the present work. We used two levels of reward magnitude, 100 and 200 points, and five levels of reward probability, which varied between p = 0.0 and p = 1.0 in steps of 0.25 (the two stimuli of no interest were 300 and 400 both at p = 0.5). The two levels of magnitude served to control for the effects of magnitude and expected value. Both magnitude and expected value are twice as high at 200 compared with 100 points. Thus, at magnitude = 100 points, expected values varied between 0 and 100 points, at magnitude = 200 points, they varied between 0 and 200 points but probabilities were the same in both conditions. If activations do not differ significantly between the two magnitude levels, then the distortions are primarily driven by probability rather than magnitude or expected value. However, the use of only two magnitude levels in the present design does not allow testing for nonlinearities in the value function and future research is needed to address this issue. The stimuli and the rewarded versus unrewarded outcomes alternated randomly within the boundaries defined by the probabilities (48 trials for p = 1.0; e.g., 36 rewarded and 12 unrewarded trials for p = 0.75), thus producing a measured mean of reward identical to the expected value. Throughout the experiment, the total points accumulated were displayed and updated in rewarded trials at the time of reward delivery. Four percent of the total points were predictably paid out as British pence at the end of the experiment.
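The arithmetic of the design can be laid out explicitly. The following sketch enumerates the ten stimuli of interest with their expected values and the split of 48 trials into rewarded and unrewarded outcomes; the function name `design_table` and the dictionary layout are illustrative choices, not part of the original task code.

```python
MAGNITUDES = (100, 200)                       # points
PROBABILITIES = (0.0, 0.25, 0.5, 0.75, 1.0)   # reward probability p
TRIALS_PER_STIMULUS = 48                      # as stated in the text

def design_table():
    """List magnitude, p, expected value and outcome counts per stimulus."""
    rows = []
    for magnitude in MAGNITUDES:
        for p in PROBABILITIES:
            rewarded = round(p * TRIALS_PER_STIMULUS)
            rows.append({
                "magnitude": magnitude,
                "p": p,
                "expected_value": magnitude * p,
                "rewarded": rewarded,
                "unrewarded": TRIALS_PER_STIMULUS - rewarded,
            })
    return rows

table = design_table()
for row in table:
    print(row)
```

For example, the stimulus with 200 points at p = 0.5 and the stimulus with 100 points at p = 1.0 share an expected value of 100 points, and p = 0.75 yields 36 rewarded and 12 unrewarded trials.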
The visual stimuli were specific combinations of attributes drawn from two visual dimensions, shape and color, indicating reward magnitude and probability, respectively. For example, two orange circles could predict 200 points with p = 0.5, whereas one dark red circle could predict 100 points with p = 1.0. Both stimuli were associated with different combinations of magnitude and probability but the same expected value (100 points). We counterbalanced the meaning of dimensions (shape or color of stimuli) and the direction in which they changed (for shape, number of circles per stimulus; for color, relative level of yellow or red) across participants. Stimulus delivery was controlled using Cogent 2000 software (Wellcome Department of Imaging Neuroscience, London, United Kingdom) as implemented in Matlab 6.5 (Mathworks).
The conditioning procedure comprised a training and a testing phase. In the training phase, participants learned the meaning of the stimuli and how to perform the task while each stimulus was presented in eight consecutive trials. Earnings in the training phase did not contribute to the monetary earnings of participants, but accumulated points were nevertheless displayed. Participants were in the scanner during the training phase while structural scans were taken. Functional data were acquired in the test phase, split into two halves, each with 24 randomly alternating presentations of each stimulus. The task remained the same as during the training phase, but outcomes contributed to total earnings. In both training and testing phase, stimuli appeared in one of the four quadrants of the screen. The quadrant of stimulus appearance varied randomly between trials. Participants were instructed to press one of four buttons corresponding to the spatial quadrant of stimulus presentation. If they failed to press the correct button within 900 ms, the trial was aborted, a red “X” appeared, and 100 points were subtracted from the accumulated earnings. Error trials were repeated, and reported results correspond to correct trials in the testing phase.
Data acquisition and analysis.
We evaluated ratings statistically by nonparametric Spearman and Kendall rank correlation, thus allowing for nonlinear changes as a function of probability. The proportion of participants showing positive or negative correlation with probability was examined with a one-sample rank test.
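To illustrate why rank correlations tolerate nonlinear but monotonic changes, a minimal sketch follows. The ratings are hypothetical, the helper names are illustrative, and the Kendall variant shown is tau-a (no tie correction); the variant used in the analysis is not specified in the text.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's r: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

probs = [0.0, 0.25, 0.5, 0.75, 1.0]
ratings = [-3.0, -0.5, 0.5, 1.0, 4.0]  # hypothetical, nonlinear but monotonic
rho = spearman(probs, ratings)
tau = kendall_tau_a(probs, ratings)
```

Both rank statistics reach their maximum for any strictly increasing rating profile, whereas the Pearson correlation of the same data stays below 1 because the increase is not linear.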
We fitted a log in odds function to each participant's rating data from after scanning, averaged across the two magnitudes (Gonzalez and Wu, 1999). As we tested the full probability spectrum only with two magnitudes, we did not estimate the curvature of the value function. The log in odds function captures the probability weighting function w(p) of prospect theory with two parameters:
$$\log\frac{w(p)}{1 - w(p)} = \gamma \log\frac{p}{1 - p} + \tau.$$
Solving for w(p) yields the following:
$$w(p) = \frac{\delta p^{\gamma}}{\delta p^{\gamma} + (1 - p)^{\gamma}},$$
with δ = exp τ.
In this version of the probability weighting function, γ primarily reflects the curvature of the weighting function whereas δ reflects its elevation. For each participant the best-fitting γ and δ were obtained and used for correlations with neuronal probability distortions. For the three participants with negative correlations between ratings and probability, the fits yielded negative parameters and these participants were excluded from the correlation analysis. The reasons for the negative correlations are unclear. One possibility is that the participants' visual preferences changed independently of reward associations and ratings in these participants were driven by visual preferences. The results reported in the remaining analyses remained significant when these three participants were excluded. We also tested preference of participants for higher probability among two concurrently presented stimuli before and after the experiment.
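A minimal sketch of the two-parameter weighting function of Gonzalez and Wu (1999), w(p) = δp^γ / (δp^γ + (1 − p)^γ), may help to make the roles of γ and δ concrete. The fitting routine below is an assumption for illustration only: it presumes ratings have already been rescaled to the open interval (0, 1) and uses ordinary least squares in log-odds space, neither of which is stated in the text. The names `lin_log_odds` and `fit_gamma_delta` are hypothetical.

```python
import math

def lin_log_odds(p, gamma, delta):
    """w(p) = delta * p**gamma / (delta * p**gamma + (1 - p)**gamma)."""
    num = delta * p ** gamma
    return num / (num + (1.0 - p) ** gamma)

def logit(x):
    return math.log(x / (1.0 - x))

def fit_gamma_delta(probs, weights):
    """Fit logit(w) = gamma * logit(p) + tau by least squares.

    Only interior points (0 < p < 1, 0 < w < 1) can be used because the
    log odds are undefined at the endpoints. Returns (gamma, delta) with
    delta = exp(tau).
    """
    xs = [logit(p) for p in probs]
    ys = [logit(w) for w in weights]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    tau = my - gamma * mx
    return gamma, math.exp(tau)

probs = [0.25, 0.5, 0.75]
weights = [lin_log_odds(p, 0.6, 0.8) for p in probs]  # synthetic data
gamma_hat, delta_hat = fit_gamma_delta(probs, weights)
```

With δ = 1, a value of γ below 1 gives w(0.25) > 0.25 and w(0.75) < 0.75 (inverted S), whereas γ above 1 gives the opposite, regular S pattern.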
We acquired gradient echo T2*-weighted echo-planar images (EPIs) with blood-oxygen-level-dependent (BOLD) contrast on a Siemens Sonata 1.5 Tesla scanner (slices/volume, 33; repetition time, 2.97 s). Depending on performance of participants, 405–500 volumes were collected in each half of the experiment, together with five “dummy” volumes at the start of each scanning run. Scan onset times varied randomly relative to stimulus onset times. A T1-weighted structural image was also acquired for each participant. Signal dropout in basal frontal and medial temporal structures resulting from susceptibility artifact was reduced by using a tilted plane of acquisition (30° to the anterior commissure-posterior commissure line, rostral > caudal). Imaging parameters were the following: echo time, 50 ms; field-of-view, 192 mm. The in-plane resolution was 3 × 3 mm, with a slice thickness of 2 mm and an interslice gap of 1 mm. High-resolution T1-weighted structural scans were coregistered to their mean EPIs and averaged together to permit anatomical localization of the functional activations at the group level.
Statistical Parametric Mapping (SPM2; Functional Imaging Laboratory) served to spatially realign functional data, normalize them to a standard EPI template and smooth them using an isotropic Gaussian kernel with a full-width at half-maximum of 10 mm. We used a standard rapid event-related fMRI approach in which evoked hemodynamic responses to each trial type are estimated separately by convolving a canonical hemodynamic response function with the onsets for each trial type and regressing these trial regressors against the measured fMRI signal (Dale and Buckner, 1997; Josephs and Henson, 1999). This approach makes use of the fact that the hemodynamic response function summates in an approximately linear manner over time (Boynton et al., 1996). By presenting trials in strictly random order and using randomly varying intertrial intervals, it is possible to separate out fMRI responses to rapidly presented events without waiting for the hemodynamic response to reach baseline after each single trial (Dale and Buckner, 1997; Josephs and Henson, 1999). Functional data were analyzed by constructing a set of stick functions at the stimulus-onset times for each of the 12 trial types in a first model and at outcome-onset times in a second model. Rewarded and unrewarded trial types were modeled separately. The stick function regressors were convolved with a canonical hemodynamic response function (HRF). Participant-specific movement parameters were modeled as covariates of no interest.
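The construction of a trial regressor from stick functions can be sketched as follows. This is not the SPM2 implementation: the double-gamma shape and its parameters (peak shape 6, undershoot shape 16, undershoot ratio 1/6) are commonly used defaults assumed here for illustration, and `build_regressor` with its 0.1 s time grid is a hypothetical helper.

```python
import math

def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)

def double_gamma_hrf(t, peak=6.0, undershoot=16.0, ratio=1.0 / 6.0):
    """Double-gamma response: an early peak minus a scaled late undershoot."""
    return gamma_pdf(t, peak) - ratio * gamma_pdf(t, undershoot)

def build_regressor(onsets, n_scans, tr, dt=0.1, hrf_length=32.0):
    """Convolve sticks at `onsets` (s) with the HRF, then sample once per scan."""
    n_fine = int(round(n_scans * tr / dt))
    sticks = [0.0] * n_fine
    for onset in onsets:
        idx = int(round(onset / dt))
        if 0 <= idx < n_fine:
            sticks[idx] += 1.0
    kernel = [double_gamma_hrf(i * dt) for i in range(int(round(hrf_length / dt)))]
    fine = [0.0] * n_fine
    for i, s in enumerate(sticks):
        if s:
            for j, h in enumerate(kernel):
                if i + j < n_fine:
                    fine[i + j] += s * h
    return [fine[min(int(round(k * tr / dt)), n_fine - 1)] for k in range(n_scans)]

TR = 2.97  # repetition time in seconds, as reported
single = build_regressor([0.0], n_scans=10, tr=TR)
later = build_regressor([5.0], n_scans=10, tr=TR)
both = build_regressor([0.0, 5.0], n_scans=10, tr=TR)
```

Because convolution is linear, the regressor for two events equals the sum of the regressors for each event alone, which is the property that allows overlapping responses to rapidly presented trials to be separated.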
The general linear model served to compute trial type-specific betas, reflecting the strength of covariance between the brain activation and the canonical response function for a given condition at each voxel for each participant (Friston et al., 1995). We parsed the entire probability range into four quartiles. To test for an inverted S-shaped probability distortion in the brain, we constructed the following contrast: [(p = 1.0 − p = 0.75) + (p = 0.25 − p = 0.0)] − (p = 0.75 − p = 0.25). This contrast reflects the critical property of the inverted S-shaped function that it increases more in the top and the bottom quartile of the probability range. Conversely, S-shaped probability distortions were tested for with the following contrast: (p = 0.75 − p = 0.25) − [(p = 1.0 − p = 0.75) + (p = 0.25 − p = 0.0)], reflecting the basic property of the S-shaped function with stronger increases in the middle range of the probability range. The effects of interest (betas, percentage of signal change) were calculated relative to an implicit baseline. Using random-effects analysis, the relevant contrasts of parameter estimates (contrast estimates) were entered into a series of t tests, simple regressions or ANOVAs with nonsphericity correction where appropriate. We used whole-brain correction for multiple comparisons [P < 0.05; false discovery rate; throughout, statistical probability (P) is distinguished from reward probability (p) by case]. Reported voxels conform to MNI (Montreal Neurological Institute) coordinate space, with the right side of the image corresponding to the right side of the brain.
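Expanding the contrast [(p = 1.0 − p = 0.75) + (p = 0.25 − p = 0.0)] − (p = 0.75 − p = 0.25) gives one weight per probability level. The sketch below applies these weights to hypothetical beta values; the beta profiles are invented for illustration and are not data from the study.

```python
PROBS = (0.0, 0.25, 0.5, 0.75, 1.0)

# Weights over betas ordered as PROBS, obtained by expanding
# [(b1.0 - b0.75) + (b0.25 - b0.0)] - (b0.75 - b0.25).
INVERTED_S = (-1.0, 2.0, 0.0, -2.0, 1.0)
# The regular S contrast is the sign-reversed inverted S contrast.
REGULAR_S = tuple(-w for w in INVERTED_S)

def contrast(betas, weights):
    """Weighted sum of condition betas."""
    return sum(b * w for b, w in zip(betas, weights))

# Hypothetical beta profiles across the five probability levels.
linear_betas = [0.0, 0.25, 0.5, 0.75, 1.0]
inverted_s_betas = [0.0, 0.4, 0.5, 0.6, 1.0]
regular_s_betas = [0.0, 0.1, 0.5, 0.9, 1.0]

for name, betas in [("linear", linear_betas),
                    ("inverted S", inverted_s_betas),
                    ("regular S", regular_s_betas)]:
    print(name, contrast(betas, INVERTED_S), contrast(betas, REGULAR_S))
```

The weights sum to zero and the p = 0.5 condition receives zero weight, so a perfectly linear probability response yields a contrast value of zero under both contrasts.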
The goal of this study was to identify the neuronal basis of nonlinear probability processing (probability distortions) in the prefrontal cortex by using functional magnetic resonance imaging in a simple stimulus-reward association task without choice (Fig. 1 a) (Materials and Methods). Specifically, we asked whether probability distortions in a typical decision brain structure are a result of the decision process or might already occur at earlier stages. The design used the full range of reward probabilities and experienced outcomes. Different stimuli predicted different all-or-none binary probability distributions of reward which varied between p = 0.0 and p = 1.0 in five levels separated by 0.25. Participants viewed one reward-predicting stimulus per trial, experienced outcomes in each trial and learned the probabilities associated with each stimulus before scanning.
We tested preference of participants for higher probability among two concurrently presented stimuli before and after the experiment. After the experiment, participants consistently preferred higher probability to lower probability stimuli. For example, 14 of 16 participants preferred the stimulus predicting 100 points with p = 1.0 over the stimulus predicting 100 points with p = 0.0 (P = 0.004, one-sample sign test). Thirteen of 16 participants preferred 200 points at p = 0.75 over 200 points at p = 0.25 (P = 0.02). Conversely, there were no significant differences in preference between the stimuli predicting reward at different probabilities before the experiment [P = 0.45 for p(100) = 1.0 vs p(100) = 0.0 and P = 0.80 for p(200) = 0.75 vs p(200) = 0.25]. These data suggest that participants' preferences became sensitive to variations in probability as a result of the experimental conditioning procedure.
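The reported sign-test values can be reproduced with a short calculation. The sketch assumes the exact two-sided binomial version of the sign test with a null probability of 0.5, which the text does not state explicitly; `sign_test_two_sided` is an illustrative name.

```python
from math import comb

def sign_test_two_sided(successes, n):
    """Exact two-sided sign test under a fair-coin null (probability 0.5)."""
    k = max(successes, n - successes)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

p_14_of_16 = sign_test_two_sided(14, 16)
p_13_of_16 = sign_test_two_sided(13, 16)
print(round(p_14_of_16, 3), round(p_13_of_16, 2))
```

Under these assumptions, 14 of 16 gives a value of about 0.004 and 13 of 16 a value of about 0.02, in agreement with the figures reported above.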
We measured the pleasantness of stimuli before and after conditioning. Pleasantness ratings did not vary as a function of probability before but increased with increasing probability after conditioning (before conditioning: median Spearman's r across participants = 0.22; Kendall's tau = 0.16; after conditioning: Spearman's r and Kendall's tau of median ratings = 1, P < 0.05; median Spearman's r across participants = 0.94; Kendall's tau = 0.88). After conditioning, the pleasantness ratings of 13 participants correlated positively with probability, those of the remaining 3 negatively (P = 0.02, one-sample sign test; P = 0.3 before conditioning). Thus, pleasantness ratings became sensitive to variations in probability as a result of the conditioning procedure.
Modern economic theories capture the shape of nonlinear patterns of probability processing formally by probability-weighting functions (Kahneman and Tversky, 1979; Gonzalez and Wu, 1999). These weighting functions determine the degree to which our probability estimations deviate from linearity with curvature and other parameters. To assess individual distortions in pleasantness ratings we fitted a two-parameter probability weighting function (Gonzalez and Wu, 1999) to each participant's ratings (after conditioning, averaged across the two magnitudes). The two parameters used by this specific weighting function capture the curvature (γ) and the elevation (δ) of the probability-weighting curve (supplemental Fig. S1, available at www.jneurosci.org as supplemental material). The obtained probability weighting functions showed individual variation in γ and δ (Fig. 1 b). Importantly, low γ values reflected overrating of small probabilities and underrating of large probabilities whereas high γ values reflected the opposite pattern. Thus, low and high γ values correspond to inverted S and regular S shape probability distortion curves, respectively. Participants' ratings showed both patterns of probability distortion (Fig. 1 c). Gamma and delta varied independently from each other (supplemental Fig. S2, available at www.jneurosci.org as supplemental material). Regardless of curvature, high δ participants rated probabilities smaller than p = 1.0 higher compared with low δ participants (supplemental Fig. S2, available at www.jneurosci.org as supplemental material).
To search for neuronal distortions in probability processing, we initially looked for inverted S and regular S-shaped probability distortions together in a two-tailed analysis. All the results reported subsequently were significant in this two-tailed test (P < 0.05, whole-brain correction). We then specifically identified brain regions showing probability distortions captured by an inverted S-shape in all participants, regardless of individual probability distortions. An inverted S probability-weighting function results in the weights of p = 0.25 and p = 0.75 being relatively closer together than the weight of p = 0.25 from that of p = 0.0 (overweighting of small probability) and the weight of p = 0.75 from that of p = 1.0 (underweighting of large probability). Accordingly, we used a contrast that tested for stronger activation changes in the bottom and top quartiles of the probability range compared with the two center quartiles. We found a highly significant covariation between modeled and actual brain activation in a region of dorsolateral prefrontal cortex (Fig. 2 a) (z = 4.9; P < 0.05, whole-brain correction). This was the only region for which effects were significant with whole-brain correction.
When evaluating the signal change in this region in greater detail we found a strong activity increase from p = 0.0 to p = 0.25, consistent with overweighting of small probabilities, and a similarly strong activity decrease between p = 1.0 and p = 0.75, consistent with underweighting of large probabilities (Fig. 2 b). To test whether the distortions were related to magnitude or expected value we compared the activations for 100 and 200 points separately at each level of probability. Both magnitude and expected value are twice as high at 200 compared with 100 points. Thus, at magnitude = 100 points, expected values varied between 0 and 100 points, at magnitude = 200 points, they varied between 0 and 200 points but probabilities were the same and varied between p = 0.0 and p = 1.0 in both conditions. We found similar distortions and no significant activation differences between the two magnitude levels (supplemental Fig. S3a, available at www.jneurosci.org as supplemental material) (P > 0.05). The similarity and overlap of activation distortions at different magnitude levels suggests that reward magnitude or value did not play a major role in the generation of the observed results. Rather, distorted activations appear to be primarily driven by probability.
To assess whether stimulus-related activations were confounded by outcome-related activations we plotted rewarded and unrewarded trials separately (although there were no unrewarded trials at p = 1.0 and no rewarded trials at p = 0.0). Activations at intermediate probabilities were similar for rewarded and unrewarded trials, suggesting that these responses were due more to conditioned stimuli than to outcomes (supplemental Fig. S3b, available at www.jneurosci.org as supplemental material). Thus, activity in the dorsolateral prefrontal cortex follows an inverted S-shape as suggested by prospect theory, but, importantly, this neuronal distortion of probability occurs even in the absence of behavioral choice.
The neuronal probability distortions reported thus far occurred when considering all participants, regardless of behavior. We reasoned that a brain region that encodes reward probability in a distorted manner should also be sensitive to behavioral distortions of individual participants. This was indeed the case. The more the individual probability-weighting functions followed an inverted S (corresponding to low γ) (Fig. 1 b), the better the regression of brain activation with an inverted S-shape in the dorsolateral prefrontal region in each individual (Fig. 2 c) (r = −0.73; P = 0.003). We used two different nonparametric, rank-dependent methods to test whether the correlations between prefrontal activation and subjective probability weighting were outlier-resistant. Both methods revealed a significant correlation (Fig. 2 c) (Spearman's r = −0.61; P = 0.03; Kendall's tau = −0.46; P = 0.03). The correlations occurred only for the distortion parameter γ of the probability weighting function, but not for the elevation parameter, δ (supplemental Figs. S4, S5 for further controls, available at www.jneurosci.org as supplemental material). These data suggest that coding of neuronal probability distortions can explain the magnitude of subjective behavioral distortions; individuals showing stronger neuronal overweighting of small and underweighting of large probabilities also showed similarly stronger behavioral over- and underweighting. The correlations with individual behavior in dorsolateral prefrontal cortex occur in addition to the behavior-independent overweighting of small and underweighting of large probabilities in the same region described above.
Consistent with previous reports (Gigliotti and Sopher, 1993; Harbaugh et al., 2002), our behavioral results indicated that not all individuals weighted probabilities according to an inverted S-shaped function. Instead, some used a regular S-shaped function, corresponding to underweighting of small and overweighting of large probabilities (Fig. 1 b). To identify regions showing S-shaped neuronal probability distortions in all participants, regardless of behavior, we used a contrast that tested for greater activation increases in the two center quartiles of the probability range compared with the bottom and the top quartiles. We found four regions in prefrontal cortex that survived whole-brain correction for multiple comparisons (supplemental Table S1, available at www.jneurosci.org as supplemental material). All regions showed similar activation patterns. A representative ventrolateral prefrontal region is shown in Figure 3 a. Activation in this region increased marginally from p = 0.0 to p = 0.25 and from p = 0.75 to p = 1.0 but considerably from p = 0.25 to p = 0.75 (Fig. 3 b). This pattern of activation corresponds to neuronal underweighting of small and overweighting of large probabilities. There were no significant activation differences between low and high levels of magnitude and between rewarded and unrewarded trials (supplemental Fig. S6, available at www.jneurosci.org as supplemental material). Importantly, the regular S-shaped distortions occurred in the absence of choice, and when testing all participants, regardless of their behavior.
We next asked whether ventrolateral prefrontal activation correlated with individual behavioral probability distortion and found that indeed it did. Furthermore, correlations were outlier-resistant. Individuals with more extreme S-shaped probability-weighting functions (high γ) also showed stronger S-shaped activation in the ventrolateral prefrontal region (Fig. 3 c) (r = 0.72; P = 0.004; Spearman's r = 0.69; P = 0.02; Kendall's tau = 0.51; P = 0.01). There was no correlation between activity in this region and the elevation parameter δ (supplemental Fig. S7, available at www.jneurosci.org as supplemental material).
Recent studies showed that the shape of probability distortions depends on whether probabilistic outcomes are actually experienced or hypothetically described (Kareev et al., 2002; Barron and Erev, 2003; Hertwig et al., 2004; Weber et al., 2004; Weber, 2006). Based on these suggestions we investigated the influence of experience on the shape of the probability-weighting function. We compared the first with the second half of the scanning trials, when participants' experience with the probabilistic outcomes of the task had increased. Indeed we found that the activity in the dorsolateral prefrontal region described in Figure 2 fitted an inverted S-shape better in the first than in the second half of the experiment (Fig. 4 a, left) (F = 16.3, P = 0.002, repeated measures ANOVA). Conversely, activity in the ventrolateral prefrontal regions shown in Figure 3 was well captured by an S-shaped regressor in both the first and the second half of the experiment (Fig. 4 a, right) (F = 0.46, P = 0.51) and, if anything, the fit of the activation with the regressor increased with experience. A significant region by experience interaction was obtained for dorsolateral and ventrolateral prefrontal cortex (F = 4.9, P = 0.03). These data suggest that increasing behavioral experience with probabilistic outcomes preferentially straightens inverted S-shaped rather than regular S-shaped neuronal probability distortions.
The preferential fit of dorsolateral prefrontal activation to an inverted S-shape distortion in the first compared with the second half of the experiment could result from a general decrease in dorsolateral prefrontal responsiveness to probability with increasing experience, i.e., a reduced probability-response slope. We therefore compared the fit to a linear probability function between the two successive halves of the experiment. Contrary to reduced responsiveness, and in agreement with a reduction in inverted S-shape processing, we found a significant fit in dorsolateral prefrontal activation with linear probability in the second half of the experiment (P < 0.05), which was significantly better than in the first half (F = 9.0, P = 0.01). Thus, dorsolateral prefrontal cortex remained responsive to reward probability with continued task experience but probability coding became more linear.
To investigate the mechanism by which experience changes the shape of the neuronal probability distortion, we reasoned that the surprising occurrence of reward or no reward might provide an occasion for learning, as suggested by formal learning theory (Rescorla and Wagner, 1972; Mackintosh, 1975). The essence of reward learning, as captured by formal learning theories, is to adjust predictions of reward according to the differences between what we have learned so far and what we currently experience in terms of reward outcomes (prediction errors). The inverted S-shape distortions of dorsolateral prefrontal cortex correspond to an overprediction of reward for low probability and underprediction of reward for high probability stimuli. Undistortion of an inverted S-shape can be induced by strong prediction error responses from the experienced omission of reward after low probability and from experienced reward occurrence after high probability stimuli. This response pattern should be expressed in the first half of the experiment, when probability representations are still more distorted in dorsolateral prefrontal cortex compared with the second half of the experiment. We analyzed outcome responses in a separate general linear model and found stronger dorsolateral prefrontal activation to reward absence in the first than the second half of the experiment with p = 0.25 (Fig. 4 b, left) (F = 9.7, P = 0.008, repeated measures ANOVA). Conversely, responses to reward delivery were stronger in the first than the second half of the experiment with p = 0.75 (Fig. 4 b, right) (F = 17.9, P = 0.001). Accordingly, there was a significant interaction of condition, outcome and experiment half in dorsolateral prefrontal cortex (F = 4.9, P = 0.03). These data suggest that the outcome responses in dorsolateral prefrontal cortex may correspond to reward prediction errors, which develop over the course of the experiment. Such development of prediction error coding may in turn contribute to undistorting neuronal representations of reward probability by experience.
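The learning-theory reasoning above can be illustrated with a toy simulation. The following Python sketch is not the analysis used in this study; it applies a Rescorla-Wagner-style update to hypothetical, inverted-S-distorted starting predictions. The starting values (0.40 for p = 0.25, 0.60 for p = 0.75), the learning rate (0.1), the fixed outcome sequences and all function names are assumptions chosen only for illustration.

```python
def rescorla_wagner(initial_prediction, outcomes, learning_rate=0.1):
    """Return (predictions after each trial, prediction error on each trial)."""
    prediction = initial_prediction
    predictions, errors = [], []
    for outcome in outcomes:
        error = outcome - prediction          # prediction error on this trial
        prediction += learning_rate * error   # move prediction toward outcome
        errors.append(error)
        predictions.append(prediction)
    return predictions, errors


def mean_abs_error(errors, outcomes, outcome, trials):
    """Mean absolute prediction error over the given trials with this outcome."""
    selected = [abs(errors[t]) for t in trials if outcomes[t] == outcome]
    return sum(selected) / len(selected)


# Hypothetical deterministic sequences with the stated reward frequencies.
low_outcomes = [0, 0, 0, 1] * 20    # reward on 25% of trials
high_outcomes = [1, 1, 1, 0] * 20   # reward on 75% of trials

# Assumed distorted starting points: overprediction at low p, underprediction at high p.
low_pred, low_err = rescorla_wagner(0.40, low_outcomes)
high_pred, high_err = rescorla_wagner(0.60, high_outcomes)

first_half, second_half = range(0, 40), range(40, 80)

print("omission error at p = 0.25, first vs second half:",
      mean_abs_error(low_err, low_outcomes, 0, first_half),
      mean_abs_error(low_err, low_outcomes, 0, second_half))
print("reward error at p = 0.75, first vs second half:",
      mean_abs_error(high_err, high_outcomes, 1, first_half),
      mean_abs_error(high_err, high_outcomes, 1, second_half))
```

Under these assumptions, errors to reward omission at low probability and to reward delivery at high probability are larger early than late, while the predictions drift toward the objective probabilities. The sketch makes no claim about the complementary outcomes or about the actual time course of the imaging signal.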
The decrease in dorsolateral activation to reward omission with p = 0.25 and reward delivery with p = 0.75 could reflect a general decline in responding to outcomes with increasing experience. To test for this possibility, we analyzed activations induced by the complementary outcomes (reward delivery with p = 0.25 and reward omission with p = 0.75). For these outcomes, there were no significant differences between the two halves of the experiment (reward delivery at p = 0.25, P = 0.13; reward omission at p = 0.75, P = 0.96), indicating that there was no evidence of a change in responsiveness. The significant outcome-related activations at p = 0.5 remained unchanged between the first and second half of the experiment (reward delivery, P = 0.13; reward omission, P = 0.59), compatible with unchanged processing and prediction error coding at p = 0.5. These data suggest that outcome response reductions at p = 0.25 (unrewarded) and p = 0.75 (rewarded) are not indicative of a general decrement in responsiveness with increasing experience in dorsolateral prefrontal cortex.
In a final step, we investigated whether there would be regions that code probability in a nondistorted manner or regions that showed a threshold effect by responding only to high probabilities. We used a regression model that tested for linear increases with probability and found a significant covariation between modeled and actual responses in the striatum (Fig. 5 a,b). The fit was significantly better with the linear than with the distorted contrasts (P < 0.05). The probability-related striatal activation increases did not correlate with behavioral probability distortions (Fig. 5 c). Thus, activity in the striatum increases linearly with probability in the absence of choice. At the chosen threshold (P < 0.05, whole-brain corrected), there were no regions that showed a threshold effect by responding exclusively to p = 0.75 or p = 1.0.
This study provides evidence for neuronal distortions of reward probabilities in prefrontal cortex and veridical reward probability processing in the striatum. The specific distortion pattern proposed by prospect theory, with overweighting of small and underweighting of large probabilities, is expressed in the activity of dorsolateral prefrontal cortex. Conversely, underweighting of small and overweighting of large probabilities is expressed in ventral parts of the prefrontal cortex. Both of these effects occur when all individuals are tested, regardless of behavior. Prospect theory uses probability distortions to explain preference reversals (Allais-paradox) (Allais, 1953) when people choose between high or low probability options (Kahneman and Tversky, 1979). A combined underweighting of large and overweighting of small probabilities by the dorsolateral prefrontal cortex could provide an anatomical basis for such preference reversals.
Our procedure differed from standard procedures of behavioral economics in that we used conditioning rather than choice, visual rather than alphanumeric stimuli, immediate rather than no or delayed feedback and repeated experience rather than hypothetical descriptions. We deliberately avoided choice situations to study the input stage of the choice process. It remains to be seen whether the distortions also occur with overt choices between feedback-free numeric options or whether they are specific to our behavioral situation.
Our results concur with the notion that people differ in how exactly their probability processing deviates from linearity. We found a relatively high proportion of individuals showing experience-compatible regular S-distortions rather than description-compatible inverted S-distortions. However, behavior reflecting regular S-distortions is not uncommon even with hypothetical decisions in the absence of experience with probabilistic outcomes (e.g., 17–28% in Kahneman and Tversky, 1979; Gigliotti and Sopher, 1993; Harbaugh et al., 2002). The relatively high proportion of regular S-distortions and substantial experience with probabilistic outcomes in our study agrees well with the observation that increasing experience increases the propensity of regular S-distortions (Kareev et al., 2002; Barron and Erev, 2003; Hertwig et al., 2004; Weber et al., 2004; Weber, 2006).
Previous neuroimaging research showed that prefrontal regions are activated by reward magnitude (Knutson et al., 2005; Yacubian et al., 2006). The absence of significant activation differences between the two levels of magnitude used here suggests that the observed distortions were primarily driven by reward probability rather than magnitude [see supplemental discussion, available at www.jneurosci.org as supplemental material, for relation of present work to previous studies of probability distortion (Paulus and Frank, 2006; Berns et al., 2008) and to lesion studies (Bechara et al., 2000; Corbit and Balleine, 2003; Clark et al., 2004; Hornak et al., 2004; Fellows and Farah, 2005; Floden et al., 2008)].
Distortions in brain responses to probability emerged here with a direct test (contrast) for such distortions in event-related brain activation to reward-predicting stimuli. Thus, our results occurred regardless of our behavioral definition of probability distortion. Only in a second step did we search for correlations of activation distortions with individual differences in pleasantness ratings. The presence of correlations in the same regions identified with the direct test may suggest that differences in pleasantness ratings indeed captured differences in probability distortions. An alternative interpretation would be that activations reflect encoding of specific visual stimuli for future recall. However, we varied the assignment of visual stimuli to probability conditions across participants, and we found no significant activations in typical visual memory encoding structures such as left ventrolateral prefrontal and temporal cortex (Wagner et al., 1999; Prince et al., 2005). Moreover, the differential effect of experience on inverted S and regular S-distortions is difficult to reconcile with an account based on recall of specific visual inputs but fits well with previous behavioral research on probability distortion (Hertwig et al., 2004; Weber et al., 2004).
The present results are fully compatible with the possibility that participants encoded reward probability, possibly for future use. In this view, the encoding or generation of decision variables such as probability could follow principles proposed by learning theory. Encoding of decision variables could consist of integration over a series of positive and negative prediction errors and assignment of the ensuing value to predictive stimuli (Rescorla and Wagner, 1972). Experience and memory factors may then lead to the distortion of probability. For example, recent experiences may receive more weight than less recent experiences (Hogarth and Einhorn, 1992). Because low probability outcomes are rare, they are less likely to have occurred recently than more probable events. Thus, underweighting of small probabilities may result from recency effects (Hertwig et al., 2004). For translation into decision making during overt choices, some of these decision variables may be combined, e.g., reward probability and magnitude. Such combinations could then be ordered for the various choice options to allow choice of the option with the highest objective or subjective value.
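The premise of the recency argument above, that rare outcomes are less likely to have occurred recently, can be made concrete with a short calculation. The Python sketch below is only an illustration under stated assumptions: trials are treated as independent, and "recently" is modeled as a hypothetical window of the last three trials. The function name and window length are not taken from the study.

```python
def prob_seen_recently(p, window):
    """Probability that an outcome with per-trial probability p occurred
    at least once within the last `window` independent trials."""
    return 1.0 - (1.0 - p) ** window


# Hypothetical recency window of three trials.
for p in (0.25, 0.5, 0.75):
    print(f"p = {p}: seen in last 3 trials with probability "
          f"{prob_seen_recently(p, 3):.3f}")
```

With this assumed window, a reward delivered at p = 0.25 was absent from the recent past on a substantial fraction of occasions, whereas a reward at p = 0.75 was almost always recently experienced. How such a difference translates into a specific weighting of probability depends on the memory model and is not addressed by this sketch.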
Our results both endorse a prediction- and learning-based account of probability distortions and expand on previous work implicating prefrontal cortex in learning (Bichot et al., 1996; Rushworth et al., 1997; Assad et al., 1998; Gehring and Knight, 2000; Corlett et al., 2004). The inverted S-distortion in dorsolateral prefrontal activation is consistent with reward overprediction for low and underprediction for high probability stimuli. Learning theories suggest that we learn about rewards whenever our predictions fail to match outcomes (prediction error; Rescorla and Wagner, 1972; Mackintosh, 1975). In agreement with this idea, dorsolateral prefrontal activation decreased with experience specifically to reward omission after low probability stimuli and to reward delivery after high probability stimuli. Thus, our findings point to a role of dorsolateral prefrontal cortex in outcome-driven plasticity of probability-weighting functions.
The current study further elucidates a role for prefrontal cortex in decision-making by revealing neuronal probability distortions without overt choice. In tasks not requiring choice, activity of lateral prefrontal neurons codes reward quality and probability and shows a preferential relation to reward rather than punishment (Watanabe, 1996; Kobayashi et al., 2002, 2006). Although prefrontal neurons are also active in choice situations (Kim and Shadlen, 1999; Barraclough et al., 2004), it is currently unknown whether this activity is specifically choice-related. Together, the data suggest a major role of lateral prefrontal cortex in coding fundamental reward decision variables.
Neuroimaging research suggests a substantial involvement of the left superior frontal sulcus in perceptual decisions (Heekeren et al., 2004, 2006). Here, we show that a region in close proximity to that lateral prefrontal region showed inverted S-distortions of reward probability [peak activation in Heekeren et al. (2004): −24/24/36; peak here: −32/16/34]. It is essential for a decision structure to have access to such a fundamental reward parameter as probability. Our results suggest the intriguing possibility that decisions performed by dorsolateral prefrontal cortex could be biased by a reward probability signal that overemphasizes small and underemphasizes large probabilities.
Inverted S-distortions may reflect primarily a cognitive response to verbal descriptions of probability, whereas regular S-distortions arise from probabilistic experience with affective outcomes (Kareev et al., 2002; Barron and Erev, 2003; Hertwig et al., 2004; Weber et al., 2004; Weber, 2006). Dissociations of cognitive and affective processes map onto dorsal and ventral parts of prefrontal cortex (Fuster, 1997; Duncan and Owen, 2000). Furthermore, ventral regions receive relatively stronger sensory inputs than dorsal regions (Pandya and Yeterian, 1998). The finding of inverted S-coding in dorsolateral, and regular S-coding in ventrolateral prefrontal cortex, reflects this dorsal-ventral scheme of prefrontal organization.
The current results suggest that lateral prefrontal cortex is particularly sensitive to experienced reward. People usually infer probabilities from repeated experience. The lateral prefrontal cortex might be well suited to do so because of its representation of previous decisions and outcomes (Barraclough et al., 2004). Conversely, reward magnitude and delay become apparent with less experience and are processed preferentially by more medial and orbitofrontal regions (Wallis and Miller, 2003; Padoa-Schioppa and Assad, 2006; Roesch et al., 2006; Rudebeck et al., 2006). The lateral prefrontal cortex is also particularly sensitive to individual differences in the processing of ambiguity (Huettel et al., 2006) and probability (present). Conversely, ventromedial regions are more sensitive to individual differences in delay processing (Kable and Glimcher, 2007). Together, the present and previous data suggest the intriguing possibility of medio-lateral dissociations in prefrontal processing of economic decision parameters.
The distorted probability signals we report in prefrontal cortex contrasted with linear signals in the striatum. Thus, the current data confirm and extend previous reports of linear probability processing in the striatum (Abler et al., 2006; Preuschoff et al., 2006; Tobler et al., 2007). The linear probability signals in the striatum did not correlate with behavioral probability distortions, whereas the distorted prefrontal signals did. Individual differences in ambiguity and risk attitude seem to be expressed primarily in prefrontal rather than striatal activity (Huettel et al., 2006; Tobler et al., 2007). Together, these data suggest that the striatum shows more veridicality and less sensitivity to individual differences than the prefrontal cortex when processing reward probability.
The co-occurrence of linear and distorted probability signals in the brain has theoretical relevance. To determine the value of choice options, traditional economic concepts (expected utility and expected value) employ linear probability terms whereas more recent concepts (prospect) employ a distorted probability term (Von Neumann and Morgenstern, 1944; Kahneman and Tversky, 1979) (see Materials and Methods). Thus, the data suggest that the linear striatal probability coding could form the basis of expected utility or expected value signals whereas the distorted dorsolateral prefrontal coding could form the basis of prospect-like signals.
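The contrast between a linear and a distorted probability term can be sketched numerically. The Python example below uses the linear-in-log-odds weighting function, one commonly used form from the behavioral literature; it is not necessarily the function specified in Materials and Methods. The parameter values (gamma = 0.6 and 1.6, delta = 1), the reward magnitude, the function names and the simplification of a linear value of magnitude are all assumptions made for illustration.

```python
def weight(p, gamma, delta=1.0):
    """Linear-in-log-odds probability weighting.
    gamma < 1 gives an inverted S-shape, gamma > 1 a regular S-shape,
    and gamma = 1 with delta = 1 returns p unchanged (linear)."""
    numerator = delta * p ** gamma
    return numerator / (numerator + (1.0 - p) ** gamma)


def expected_value(p, magnitude):
    """Value with a linear probability term."""
    return p * magnitude


def prospect_like_value(p, magnitude, gamma):
    """Value with a distorted probability term (magnitude kept linear here,
    which is a simplification relative to full prospect theory)."""
    return weight(p, gamma) * magnitude


magnitude = 100.0  # hypothetical reward magnitude
for p in (0.25, 0.5, 0.75):
    print(f"p = {p}: linear = {expected_value(p, magnitude):.1f}, "
          f"inverted S = {prospect_like_value(p, magnitude, 0.6):.1f}, "
          f"regular S = {prospect_like_value(p, magnitude, 1.6):.1f}")
```

With these assumed parameters, the inverted S-shaped term overweights p = 0.25 and underweights p = 0.75, the regular S-shaped term does the opposite, and all three agree at p = 0.5.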
Sensory illusions could pose an analogy with behavioral preference reversals because subjective behavior fails to reflect objective inputs in both cases (Kahneman and Tversky, 1979). Sensory illusions and preference reversals could arise from distortions in visual information and reward probability processing, respectively. As with sensory illusions (Chen et al., 2003; Jazayeri and Movshon, 2007), our results suggest that reward probability distortions can occur at different processing stages. In this scheme, the striatum would correspond more to the retina with veridical probability coding whereas prefrontal cortex would correspond more to higher-order sensory regions with distorted signals. The fact that signals in the striatum encode probability linearly suggests that the incorporation of distortions occurs downstream of the divergence of pathways leading to the striatum and prefrontal cortex (including in the prefrontal cortex itself). Nonetheless, it is possible that more sensory regions may process or store distorted information as well, as is the case with contextual visual illusions (Murray et al., 2006). More importantly, by revealing neuronal probability distortions at a neuronal encoding stage of processing before actual choice behavior, the present data show how neuroimaging data inform and expand predictions derived from behavioral theory.
This work was supported by the Wellcome Trust, the Swiss National Science Foundation, the Roche Research Foundation, and the Greek Government Scholarship Foundation. R.J.D. and W.S. are supported by Wellcome Trust Programme grants and W.S. by a Wellcome Trust Principal Research Fellowship. We thank Chris Burke and Martin Kocher for comments.
Correspondence should be addressed to Dr. Philippe N. Tobler, Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, United Kingdom.