## Abstract

Decisions we face in real life are inherently risky and can result in one of many possible outcomes. However, most of what we know about choice under risk is based on studies that use options with only two possible outcomes (simple gambles), so it remains unclear how the brain constructs reward values for more complex risky options faced in real life. To address this question, we combined experimental and modeling approaches to examine choice between pairs of simple gambles and pairs of three-outcome gambles in male and female human subjects. We found that subjects evaluated individual outcomes of three-outcome gambles by multiplying functions of reward magnitude and probability. To construct the overall value of each gamble, however, most subjects differentially weighted possible outcomes based on either reward magnitude or probability. These results reveal a novel dissociation between how reward information is processed when evaluating complex gambles: valuation of each outcome is based on a combination of reward information whereas weighting of possible outcomes mainly relies on a single piece of reward information. We show that differential weighting of possible outcomes could enable subjects to make decisions more easily and quickly. Together, our findings reveal a plausible mechanism for how salience, in terms of possible reward magnitude or probability, can influence the construction of subjective values for complex gambles. They also point to separable neural mechanisms for how reward value controls choice and attention to allow for more adaptive decision making under risk.

**SIGNIFICANCE STATEMENT** Real-life decisions are inherently risky and can result in one of many possible outcomes, but how does the brain integrate information from all these outcomes to make decisions? To address this question, we examined choice between pairs of gambles with multiple outcomes using various computational models. We found that subjects evaluated individual outcomes by multiplying functions of reward magnitude and probability. To construct the overall value of each gamble, however, they differentially weighted possible outcomes based on either reward magnitude or probability. By doing so, they were able to make decisions more easily and quickly. Our findings illustrate how salience, in terms of possible reward magnitude or probability, can influence the construction of subjective values for more adaptive choice.

## Introduction

Every decision we make entails some degree of risk and uncertainty and can result in one of many possible outcomes. For example, when choosing which restaurant to go to for lunch, one needs to consider many factors such as commute time, pricing, and wait time, each of which could vary depending on traffic, food availability, and the number of other customers who have also chosen that restaurant. To compute the overall values for such complex options or to directly compare those options, the brain has to assign a value to each possible or relevant outcome based on reward information (e.g., expected reward and probability of a given outcome) followed by integration or direct comparison of those values. Any of these processes can be very daunting when there are multiple pieces of reward information and many possible outcomes. Therefore, to enable decision-making between real-world options, the brain must rely on certain mechanisms that simplify these valuation and choice processes to reduce mental effort and to make value-based decision making more adaptive (Payne et al., 1988).

Although prospect theory (Kahneman and Tversky, 1979), the standard model of choice under risk, has been very successful in capturing many aspects of choice (Wu and Gonzalez, 1996; Birnbaum and Navarrete, 1998; Gonzalez and Wu, 1999; Abdellaoui, 2000; Bruhin et al., 2010; Glöckner and Pachur, 2012), it fails to account for choice between gambles with more than two alternative outcomes (complex gambles). As a result, various models have been proposed to tackle valuation and choice between complex gambles, including cumulative prospect theory (Tversky and Kahneman, 1992), transfer of attention exchange (Birnbaum and Navarrete, 1998; Birnbaum, 2008), decision field theory (Busemeyer and Townsend, 1993), and salience theory of choice (Bordalo et al., 2012, 2013). Interestingly, most of these models use a “rank-dependent” strategy similar to what is proposed in cumulative prospect theory. This strategy assumes that possible outcomes are ranked based on different variables, such as reward probability, and the ranking, in turn, determines the influence of each outcome on the overall value via different (model-dependent) mechanisms. Despite the success of these models in capturing an overall pattern of choice between complex gambles, it is still unclear how mechanisms proposed in these models can be instantiated in the brain given the complicated computations required by such models.

Here, we used a combination of experimental and modeling approaches to reveal plausible neural mechanisms underlying valuation and choice between complex gambles. Considering its role in selection between multiple sources of information for further processing (Wolfe and Horowitz, 2004; Carrasco, 2011), we hypothesized that attention is also involved in the evaluation and choice between gambles with multiple outcomes. To test our hypothesis, we performed an experiment in which human subjects selected between pairs of gambles with only one non-zero outcome (simple gambles) and pairs of gambles with three non-zero outcomes (complex gambles). Furthermore, we developed a large family of models to capture the observed choice behavior. In these models, the value of a complex gamble was constructed by differentially weighting the value of its possible outcomes via a simple attentional mechanism. More specifically, attention could be guided by different types of reward information (e.g., reward magnitude, reward probability, expected value, etc.) to allow for differential weighting of possible outcomes. This essential feature of our model enabled evaluation of individual outcomes and their weighting to rely on different pieces of reward information (e.g., evaluation based on expected value but differential weighting based on magnitude). We fit subjects' choice behavior with our and competing models to assess our models' ability in capturing choice behavior and to address two key questions regarding the construction of reward value for complex gambles. First, how are individual outcomes of a complex gamble evaluated? Second, how are possible outcomes compared between two complex gambles, or equivalently, how are the values of possible outcomes combined to compute the overall value of a complex gamble?

## Materials and Methods

##### Subjects.

A total of 64 human subjects (38 females) were recruited from the Dartmouth College undergraduate student population. Subjects were compensated with money and/or “t-points”, which were extra-credit points for introductory classes in the department of Psychological and Brain Sciences at Dartmouth College. The base rate for compensation was $10/h or 1 t-point/h. All subjects were then additionally rewarded based on their performance, up to $15/h. This additional performance-based compensation was always monetary. None of the subjects was excluded from our final data analyses. All experimental procedures were approved by the Dartmouth College Institutional Review Board, and informed consent was obtained from all subjects before participating in the experiment.

##### Experimental design.

This study used a within-subject design. In two experimental sessions, each subject performed two tasks (simple-gamble and complex-gamble tasks) in which he/she selected between a pair of gambles on every trial and was provided with feedback about the outcome of the chosen gamble. In both tasks, gambles were presented as rectangular bars divided into different portions. A portion's color indicated the reward magnitude of that outcome, and its size signaled its probability (see Fig. 1*A–B*). This gamble presentation was adopted from a recent study in monkeys (Strait et al., 2014) because this design allowed us to accurately present complex gambles without using any numbers, making evaluation and decision making more intuitive. For both tasks, subjects were instructed to maximize their reward points, which later translated to monetary reward or t-points, by choosing the gamble that they believed was more likely to provide more reward points. The selected gamble was resolved following each choice according to probabilities associated with possible outcomes of the chosen gamble.

Before the beginning of each task, subjects completed a training session in which they selected between two sure gambles. These training sessions were used to familiarize subjects with the associations between four different colors (purple, magenta, green, and gray) and their corresponding reward values, which depended on the task. In both training sessions, all subjects selected the gamble with higher expected value (EV) on >70% of the trials, indicating that they understood the color–reward associations. For the simple-gamble task, reward values were always 0, 1, 2, and 4 points. For the complex-gamble task, 0 and 1 were used, but the other two reward values varied for each subject depending on their subjective utility (see Complex-Gamble Task). No reward (0 points) was always assigned to the gray color. The color–reward assignment remained consistent for each subject throughout both the training session and its corresponding task. The color–reward assignments, however, were randomized between subjects.

##### Simple-gamble task.

The simple-gamble task consisted of two types of trials: (1) choice between a sure option and a simple gamble with two outcomes of either a reward larger than that of the sure option or no reward with complementary probabilities, and (2) choice between two simple gambles (Fig. 1*A*). Reward magnitude and probability were represented by the color and length of the corresponding portion, respectively. Subjects evaluated a total of 63 unique gamble pairs, each of which was shown four times in a random order (total of 252 trials). To make this choice task nontrivial, the gamble pairs were constructed to be relatively similar in expected value.

##### Complex-gamble task.

During the complex-gamble task, subjects selected between 70 unique pairs of gambles with three possible non-zero outcomes. Each pair was presented four times in a random order (total of 280 trials). The complex-gamble task was similar to the simple-gamble task with a few exceptions to increase the sensitivity of our experimental paradigm (Fig. 1*B*). More specifically, we constructed gambles with almost equal subjective values for each subject by tailoring the reward magnitudes and probabilities of individual gamble outcomes to each subject. First, the middle and large reward values were tailored for each subject according to their utility function estimated from their choice behavior in the simple-gamble task. No reward (0 points) and the small reward (1 point) remained unchanged from the simple-gamble task, whereas the middle and large magnitudes were adjusted to have approximately double and quadruple the small reward's utility, respectively. We kept the maximum value of reward magnitude at 10, resulting in medians of 3 and 8 for the middle and large rewards, respectively. Although reward magnitudes associated with each color might differ between the simple-gamble and complex-gamble tasks for individual subjects, the relative order of color–reward association did not. For example, the largest reward value in the complex-gamble task would be associated with the color previously corresponding to 4 points in the simple-gamble task.

To construct pairs of three-outcome gambles that are very close in subjective utility for each subject, we also adjusted probabilities of alternative outcomes based on the subject's estimated utility and probability weighting functions from the simple-gamble task. More specifically, the combination of outcome probabilities for each gamble was selected from one of the following sets: {0.6, 0.2, 0.2}, {0.4, 0.3, 0.3}, {0.5, 0.3, 0.2}, and {0.4, 0.4, 0.2}. From the set of all possible gambles, we then picked pairs of gambles for which the difference in the subjective values was less than 5% of the difference between the maximum and minimum subjective values of all possible pairs. We required the probabilities of the three outcomes to differ from one another by a value larger than or equal to 0.1 to ensure that differences in probabilities were easily visually discernible. Our procedure and the large set of possible pairs guaranteed that there was no correlation between reward magnitude and probability of each gamble outcome (Pearson correlation: *r* = 0.01, *p* = 0.78; Spearman correlation: *r* = 0.026, *p* = 0.33). Finally, we also included 20 “catch” trials in which one of the gambles was better than the other with respect to both reward magnitude and probability. Catch trials were included to determine whether subjects were attentive to the presented gambles.

##### Modeling valuation and choice between simple gambles.

We fit choice behavior of individual subjects during the simple-gamble task using four quantities to assign value to the only non-zero gamble outcome. These include: expected value (EV = *m* × *p*); expected value with the probability weighting function [EV with *w*(*p*) = *m* × *w*(*p*)]; expected utility [EU = *u*(*m*) × *p*]; and subjective utility [SU = *u*(*m*) × *w*(*p*)]. We considered a power law for the utility function:

$$u(m) = m^{\rho} \tag{1}$$

where ρ > 0 measures the curvature of the utility function. The probability weighting function (subjective probability) was modeled using the one-parameter Prelec function (Prelec, 1998):

$$w(p) = \exp\left(-(-\ln p)^{\gamma}\right) \tag{2}$$

where γ > 0 measures the distortion of the probability weighting function. Finally, using reward values assigned to the two gambles on each trial, the probability that the subject would choose the gamble on the right (*P*_{R}) was computed as follows:

$$P_R = \frac{1}{1 + \exp\left(-(V_R - V_L)/\sigma\right)} \tag{3}$$

where *V*_{R} and *V*_{L} denote the value of the right and left gambles, respectively, and σ is a model parameter that measures stochasticity in choice by transforming the difference in reward values to the probability of choice.
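
The valuation and choice rules for simple gambles can be illustrated with a minimal Python sketch. It assumes the standard functional forms named in the text (a power-law utility, the one-parameter Prelec function, and a logistic choice rule); the function names are hypothetical, and the value of σ used in the example is an arbitrary illustration rather than a fitted estimate.

```python
import math


def utility(m, rho):
    """Power-law utility: u(m) = m ** rho, with rho > 0 setting the curvature."""
    return m ** rho


def prelec_weight(p, gamma):
    """One-parameter Prelec weighting: w(p) = exp(-(-ln p) ** gamma)."""
    if p <= 0.0:
        return 0.0  # limit of the Prelec function as p -> 0
    if p >= 1.0:
        return 1.0  # w(1) = 1
    return math.exp(-((-math.log(p)) ** gamma))


def subjective_utility(m, p, rho, gamma):
    """SU = u(m) * w(p) for the only non-zero outcome of a simple gamble."""
    return utility(m, rho) * prelec_weight(p, gamma)


def prob_choose_right(v_right, v_left, sigma):
    """Logistic choice rule; sigma controls stochasticity in choice."""
    return 1.0 / (1.0 + math.exp(-(v_right - v_left) / sigma))


# Example: rho and gamma are the median estimates reported in the text;
# sigma = 0.2 is a hypothetical value chosen only for illustration.
rho, gamma, sigma = 0.63, 0.88, 0.2
v_left = subjective_utility(2, 0.5, rho, gamma)    # 2 points with p = 0.5
v_right = subjective_utility(4, 0.25, rho, gamma)  # 4 points with p = 0.25
p_right = prob_choose_right(v_right, v_left, sigma)
```

Setting ρ = 1 and γ = 1 in this sketch reduces SU to EV, which is how the four candidate quantities (EV, EV with *w*(*p*), EU, and SU) relate to one another.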

##### Modeling valuation and choice between complex gambles.

We used a large family of models to fit choice behavior during the complex-gamble task. Our models differed in their assumptions about how individual outcomes of a given gamble are evaluated (8 possible quantities also referred to as strategies; Fig. 1*C*), how individual outcome values are sorted (6 possible quantities also referred to as strategies), and how the sorted outcomes are combined to form the overall value of a three-outcome gamble (5 possible strategies; Fig. 1*D*). Reward values assigned to the two gambles on each trial were used to determine the probability of selecting between the two gambles based on Equation 3. More formally, the value of an option *X*, *V*_{X}, depends on sorting, evaluation, and weighting strategies as follows:

$$V_X = \sum_{k} DW_{\langle \text{strategy\#} \rangle, X}^{\,I_{\langle \text{sort} \rangle, X}^{k}} \; V_{\langle \text{eval} \rangle, X}^{k}$$

where *V*_{<eval>,X}^{k} is the value of outcome *k* of option *X*, and *DW*_{<strategy#>,X}^{I_{<sort>,X}^{k}} is the weight of outcome *k* of option *X* based on sorting index *I*_{<sort>,X}^{k} and the weighting strategy.

*V*_{<eval>,X}^{k} can be equal to magnitude (*m*), utility [*u*(*m*)], probability (*p*), or weighted probability [*w*(*p*)], which we collectively refer to as “single-attribute” evaluation strategies. *V*_{<eval>,X}^{k} can also be equal to expected value (*m* × *p*), expected utility [*u*(*m*) × *p*], expected value with weighted probability [*m* × *w*(*p*)], or subjective utility [*u*(*m*) × *w*(*p*)], which we collectively refer to as “combined-attribute” evaluation strategies.

The sorting index for each outcome, *I*_{<sort>,X}^{k} ∈ {*b*: *best*, *m*: *middle*, *w*: *worst*}, is computed after sorting outcomes based on a given sorting quantity. Note that sorting outcomes based on *m* and *u*(*m*), and based on *p* and *w*(*p*), would be identical, reducing the number of sorting strategies to six.

Finally, *DW*_{<strategy#>,X}^{I_{<sort>,X}^{k}} determines the weight for each outcome based on the sorting index and one of five possible weighting strategies (Fig. 1*D*). Weighting Strategy 1 equally weights the three possible outcomes, *DW* = [β_{b}, β_{m}, β_{w}] = [1/3, 1/3, 1/3], where β_{b}, β_{m}, and β_{w} denote the weight assigned to the best, middle, and worst outcome, respectively. Strategy 2 assigns equal weights to the best and middle outcomes and ignores the worst outcome, *DW* = [1/2, 1/2, 0]. Strategy 3 uses a single parameter to distribute weights between the best and middle outcomes while assigning zero weight to the worst outcome, *DW* = [β_{b}, (1 − β_{b}), 0]. Strategy 4 freely distributes weights to the three outcomes using two parameters, *DW* = [β_{b}, 1 − (β_{b} + β_{w}), β_{w}]. Strategy 5 is similar to Strategy 4 but uses a separate parameter for distributing weight when there is a zero-magnitude outcome. We also tested an additional strategy with equal weighting for middle and worst outcomes. The results for this strategy are not reported because this strategy was not able to successfully capture the choice data of any subject. Finally, we note that although weighting Strategies 1, 2, and 3 can be considered as special cases of Strategy 4, we tested these strategies because they have fewer parameters, which could result in a better goodness-of-fit when the number of parameters was considered (see Model comparison and data analysis). All the possible combinations of strategies resulted in the generation of 240 alternative models (8 possible strategies for evaluating individual outcomes, 6 for sorting outcomes, and 5 for weighting).
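
The way sorting, evaluation, and weighting combine into a single gamble value can be sketched in Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, "best" is taken to mean the outcome with the largest sorting quantity, ties in the sorting quantity are not treated specially, and the β values in the example are arbitrary rather than fitted.

```python
def gamble_value(outcomes, evaluate, sort_key, weights):
    """Weighted sum of outcome values for one gamble.

    outcomes: list of (magnitude, probability) pairs
    evaluate: function (m, p) -> value assigned to a single outcome
    sort_key: function (m, p) -> quantity used to rank outcomes
    weights:  (beta_best, beta_middle, beta_worst)
    """
    # Rank outcomes from best to worst on the sorting quantity
    # (assumption: a larger sorting quantity means a better rank).
    ranked = sorted(outcomes, key=lambda o: sort_key(*o), reverse=True)
    return sum(w * evaluate(m, p) for w, (m, p) in zip(weights, ranked))


def strategy4_weights(beta_b, beta_w):
    """Strategy 4: DW = [beta_b, 1 - (beta_b + beta_w), beta_w]."""
    return (beta_b, 1.0 - (beta_b + beta_w), beta_w)


def expected_value(m, p):
    """Combined-attribute evaluation: m * p."""
    return m * p


def by_magnitude(m, p):
    """Sorting quantity: reward magnitude only."""
    return m


# Evaluation based on EV, differential weighting based on magnitude.
gamble = [(1, 0.5), (8, 0.2), (3, 0.3)]
weights = strategy4_weights(beta_b=0.5, beta_w=0.2)  # hypothetical betas
value = gamble_value(gamble, expected_value, by_magnitude, weights)
# Ranked by magnitude: EVs are 1.6, 0.9, 0.5 with weights 0.5, 0.3, 0.2,
# so value is approximately 1.17.

# Size of the model family: 8 evaluation x 6 sorting x 5 weighting strategies.
n_models = 8 * 6 * 5
```

Passing separate `evaluate` and `sort_key` functions mirrors the key feature of the model family: outcome evaluation and outcome weighting can rely on different pieces of reward information.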

##### Competing models of valuation and choice between complex gambles.

To compare our model with existing models of valuation and choice between complex gambles, we fit choice behavior of our subjects during the complex-gamble task using four different rank-dependent models. These include cumulative prospect theory (CPT; Tversky and Kahneman, 1992), transfer of attention exchange (TAX; Birnbaum and Navarrete, 1998; Birnbaum, 2008), salience theory of choice (STC; Bordalo et al., 2012, 2013), and decision field theory (DFT; Busemeyer and Townsend, 1993). In the following sections, we provide a summary of these models and how they are implemented.

##### Cumulative prospect theory.

CPT generalizes prospect theory (PT) for choice under uncertainty in multiple ways. Importantly, by adopting a cumulative representation for probability, CPT can be applied to gambles with more than two non-zero outcomes and removes the need for the editing rules of combination and dominance detection. More specifically, for gambles with strictly non-negative outcomes (*m*_{1} ≥ *m*_{2} ≥ … ≥ *m*_{n} ≥ 0), the utility of gamble *G*, *CPT*(*G*), is equal to:

$$CPT(G) = \sum_{i=1}^{n} \left[ W(P_i) - W(P_{i-1}) \right] u(m_i)$$

where *W*(*P*_{i}) is the cumulative weighting function of the probability of receiving at least *m*_{i} (*P*_{i} = ∑_{j=1}^{i} *p*_{j}, with *P*_{0} = 0), with boundary conditions of *W*(0) = 0 and *W*(1) = 1. The cumulative weighting function (Eq. 7) has a single free parameter, γ. Values of γ > 1 and γ < 1 correspond to S-shaped and inverse-S-shaped curves, respectively.
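
The rank-dependent computation in CPT can be sketched in Python as follows. The decision weight of each outcome is the increment of the cumulative weighting function, *W*(*P*_{i}) − *W*(*P*_{i−1}), which is the standard CPT construction for non-negative outcomes. As an assumption made only for this illustration, the sketch uses the one-parameter weighting form of Tversky and Kahneman (1992) for *W* and a power-law utility; the function names are hypothetical.

```python
def tk_cumulative_weight(P, gamma):
    """Assumed form (Tversky and Kahneman, 1992):
    W(P) = P**g / (P**g + (1 - P)**g) ** (1 / g), with W(0) = 0 and W(1) = 1.
    """
    if P <= 0.0:
        return 0.0
    if P >= 1.0:
        return 1.0
    num = P ** gamma
    return num / (num + (1.0 - P) ** gamma) ** (1.0 / gamma)


def cpt_value(outcomes, rho, gamma):
    """CPT value of a gamble with non-negative outcomes.

    outcomes: list of (magnitude, probability) pairs, in any order.
    """
    # Rank outcomes from the largest to the smallest magnitude.
    ranked = sorted(outcomes, key=lambda o: o[0], reverse=True)
    value, cum_p, prev_w = 0.0, 0.0, 0.0
    for m, p in ranked:
        cum_p += p  # probability of receiving at least m
        w = tk_cumulative_weight(min(cum_p, 1.0), gamma)
        value += (w - prev_w) * (m ** rho)  # decision weight times utility
        prev_w = w
    return value


gamble = [(1, 0.5), (8, 0.2), (3, 0.3)]
# With rho = 1 and gamma = 1 the weights equal the probabilities,
# so the CPT value reduces to the expected value of the gamble.
ev_check = cpt_value(gamble, rho=1.0, gamma=1.0)
distorted = cpt_value(gamble, rho=0.63, gamma=0.88)
```

Because the decision weights are increments of *W* over the ranked outcomes, they always sum to *W*(1) = 1, regardless of the value of γ.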

##### Transfer of attention exchange.

In TAX, the utility of a gamble is equal to a weighted average of the utilities of the outcomes (Birnbaum and Navarrete, 1998; Birnbaum, 2008). These weights, referred to as “configural” weights, depend on the probability and rank of the outcome branches and, therefore, on the relationship between branches. This model assumes that a decision-maker deliberates by attending to possible outcomes of an action depending on their risk attitude. Not only can branches leading to larger rewards attract more attention, but branches leading to lower-value outcomes can also attract greater attention if a person is risk-averse. Importantly, the weights of branches result from transfers of attention from one branch to another. If there were no configural effects, then each branch would have weights purely as a function of cumulative outcome probability, *W*(*P*). However, depending on the subject's point of view (i.e., risk attitude), weight is transferred from branch to branch. For example, for a risk-averse subject, weight can be transferred from a higher-value branch *k* to a lower-value branch *i* (*m*_{k} ≥ *m*_{i}). With ω(*p*_{i}, *p*_{k}, *n*) representing the weight transferred from branch *k* to branch *i*, the value of gamble *G* in TAX is written in terms of *t*(.), a monotonic function of probability, and ω(.), which is specified such that the weight transferred is a fixed proportion of the weight of the branch giving up the weight. This formulation represents a general case, but assuming lower-value branches receive greater weights (δ > 0), a special TAX model (Birnbaum, 2008) can be written for the value of three-outcome gambles, *G* = (*m*_{1}, *p*_{1}; *m*_{2}, *p*_{2}; *m*_{3}, *p*_{3}), where *m*_{1} ≥ *m*_{2} ≥ *m*_{3} ≥ 0, in which *w*(.) is the probability weighting function (Eq. 2).

##### Salience theory of choice.

In STC, the decision-maker's attention is drawn to (precisely defined) salient payoffs (Bordalo et al., 2012, 2013). This leads the decision-maker to a context-dependent representation of gambles in which true probabilities are replaced by decision weights distorted in favor of salient payoffs. By specifying decision weights as a function of payoffs, STC provides a unified account of many empirical phenomena, including frequent risk-seeking behavior, invariance failures such as the Allais paradox, and preference reversals. The value of a gamble in STC is computed by weighting possible outcomes based on their salience [for gambles with three outcomes, *G* = (*m*_{1}, *p*_{1}; *m*_{2}, *p*_{2}; *m*_{3}, *p*_{3})], using *W*(.), the cumulative weighting function (Eq. 7), and ω_{i}^{l}, the “salient weight” for outcome *i* of gamble *l*. The salient weight for each outcome is computed using the salient ranking of each outcome, with a free parameter δ ∈ (0,1] and the salient ranking *r*_{i}^{l} ∈ {1, …, |*I*|} of outcome *i* in gamble *G*_{l} (lower *r*_{i}^{l} indicates higher salience). Given two outcomes of gamble *G*_{l}, *i* and *ı̃* ∈ *I*, outcome *i* is considered to be more salient than outcome *ı̃* if σ(*m*_{i}^{l}, *m*_{i}^{k}) > σ(*m*_{ı̃}^{l}, *m*_{ı̃}^{k}), where σ(*m*_{i}^{l}, *m*_{i}^{k}) is the function measuring the saliency based on the outcome magnitude in the two competing gambles *l* and *k*, with θ > 0 as a free parameter.

##### Decision field theory.

Busemeyer and Townsend (1993) derive DFT from an intuitive but sophisticated computational logic. Suppose that a decision-maker attempts to choose according to rank-dependent values of alternative gambles (such as those given by CPT) but does not have an algorithm for effortlessly and quickly multiplying utilities and weights together. The decision-maker could instead proceed by sampling the possible utilities of options in proportion to their decision weights, computing the running sums of these sampled utilities for each option, and stopping (and choosing) when the difference between the sums exceeds some threshold determined by the cost of sampling. Considering this algorithm, the probability of choosing the right option based on the difference in values of the right and left options can be simplified to an expression involving *F*, the sigmoid function [*F*(*x*,σ) = 1/(1 + exp(−*x*/σ))], and *W*(.), the cumulative weighting function (Eq. 7).
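
The sequential-sampling logic behind DFT can be illustrated with a short Python simulation. This is a sketch of the verbal algorithm only, not the simplified closed-form choice probability used for fitting. The function name, the `max_steps` safeguard, and the example utilities, weights, and threshold are all hypothetical.

```python
import random


def sequential_sampling_choice(right, left, threshold, rng, max_steps=10_000):
    """Simulate one choice between two gambles by sequential sampling.

    right, left: lists of (utility, decision_weight) pairs for each gamble
    threshold:   positive stopping bound on the difference of running sums
    rng:         a random.Random instance (for reproducibility)
    Returns "right", "left", or None if no bound is reached in max_steps.
    """
    r_utils, r_weights = zip(*right)
    l_utils, l_weights = zip(*left)
    diff = 0.0  # running sum for right minus running sum for left
    for _ in range(max_steps):
        # Sample one utility per option in proportion to its decision weight.
        u_r = rng.choices(r_utils, weights=r_weights)[0]
        u_l = rng.choices(l_utils, weights=l_weights)[0]
        diff += u_r - u_l
        if abs(diff) > threshold:
            return "right" if diff > 0 else "left"
    return None


rng = random.Random(0)
right_gamble = [(4.0, 0.2), (2.0, 0.3), (1.0, 0.5)]  # hypothetical values
left_gamble = [(4.0, 0.3), (2.0, 0.2), (1.0, 0.5)]
choices = [
    sequential_sampling_choice(right_gamble, left_gamble, threshold=5.0, rng=rng)
    for _ in range(200)
]
# Fraction of simulated trials on which the right gamble was chosen.
fraction_right = choices.count("right") / len(choices)
```

In this scheme a larger threshold produces slower but less noisy choices, which is the role played by the cost of sampling in the verbal description.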

##### Model comparison and data analysis.

We fit choice data using our models and the aforementioned competing models by minimizing the negative log likelihood of the predicted choice probability given different model parameters. Minimization was done using the fminsearch function in MATLAB (MathWorks), starting from 50 different sets of initial parameter values. For model comparison and selection, we used the Akaike information criterion (AIC) or the Bayesian information criterion (BIC) to account for the different number of parameters in different models. Although the minimum AIC (and BIC) values varied across subjects, reflecting differences in how well each subject's choice behavior could be captured, all model comparisons were performed using AIC or BIC values within each subject, making our statistical tests more robust. We also examined the quality of model selection using Vuong's test (Vuong, 1989; see below). We note that because of the small number of trials (280) and the large number of gamble pairs (70), it is impossible to perform cross validation on data from individual subjects. Additionally, cross validation across subjects is futile because of individual variability and differences in risk attitude.
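
The quantities involved in this comparison can be written out in a short Python sketch. It uses the standard definitions AIC = 2*k* + 2·NLL and BIC = *k*·ln(*n*) + 2·NLL, where NLL is the minimized negative log likelihood, *k* is the number of free parameters, and *n* is the number of trials. The function names and the toy data are hypothetical, and the optimization step itself is not shown.

```python
import math


def negative_log_likelihood(p_right, chose_right):
    """NLL of observed binary choices given predicted P(choose right)."""
    eps = 1e-12  # guards against log(0) for extreme predictions
    nll = 0.0
    for p, chose in zip(p_right, chose_right):
        p = min(max(p, eps), 1.0 - eps)
        nll -= math.log(p if chose else 1.0 - p)
    return nll


def aic(nll, n_params):
    """Akaike information criterion from a minimized NLL."""
    return 2.0 * n_params + 2.0 * nll


def bic(nll, n_params, n_trials):
    """Bayesian information criterion from a minimized NLL."""
    return n_params * math.log(n_trials) + 2.0 * nll


# Toy example: a model that predicts chance (0.5) on all 280 trials.
n_trials = 280
p_right = [0.5] * n_trials
chose_right = [i % 2 == 0 for i in range(n_trials)]
nll = negative_log_likelihood(p_right, chose_right)
aic_value = aic(nll, n_params=1)
bic_value = bic(nll, n_params=1, n_trials=n_trials)
```

With 280 trials, ln(*n*) is larger than 2, so BIC penalizes each additional parameter more heavily than AIC does.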

The fitting of choice behavior in the simple-gamble task allowed us to estimate the utility and probability weighting function for individual subjects. To identify the strategies used for sorting, evaluating, and weighting in the complex-gamble task, we computed and compared the goodness-of-fit (AIC and BIC) across all models to find the best overall model. The best model was then used to determine the strategies used for sorting, evaluating, and weighting. In other words, we did not find the best average model across each strategy dimension and instead, found the best overall model across all strategy dimensions.

To further examine the quality of model comparison, we also used Vuong's test (Vuong, 1989) for model selection. Specifically, Vuong's test determines the best model as the model with the log likelihood (LL) significantly smaller than that of the second-best model. Considering *N* samples of LL values for each model, the Vuong statistic for comparing models *i* and *j* is calculated from the following quantities: *LLR*, the sum of LL ratios for the two models; *C*_{ij}, a correction term for the difference in degrees of freedom between the two models; *V*, the variance of LL ratios between the two models; and *K*_{i} and *K*_{j}, the numbers of parameters in models *i* and *j*, respectively. It has been shown that the Vuong statistic follows a standard normal distribution *N*(0,1). As a result, model *i* can be considered better than model *j* if the Vuong statistic > 1.96, and vice versa.
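
A common textbook form of the Vuong statistic can be sketched in Python as follows. Several choices here are assumptions made for illustration rather than details taken from the text: the inputs are per-trial log likelihoods for which larger values indicate a better fit, the correction term is the Schwarz-type penalty (*K*_{i} − *K*_{j})/2 · ln(*N*), and the variance is computed with a 1/*N* normalization. The function name is hypothetical.

```python
import math


def vuong_statistic(ll_i, ll_j, k_i, k_j):
    """Vuong z-statistic for comparing models i and j.

    ll_i, ll_j: per-trial log likelihoods under each model (same trials)
    k_i, k_j:   numbers of free parameters in each model
    A positive value favors model i (assuming larger LL is better).
    """
    n = len(ll_i)
    ratios = [a - b for a, b in zip(ll_i, ll_j)]  # pointwise LL ratios
    llr = sum(ratios)
    mean = llr / n
    variance = sum((r - mean) ** 2 for r in ratios) / n
    # Assumed Schwarz-type correction for the difference in parameters.
    correction = 0.5 * (k_i - k_j) * math.log(n)
    return (llr - correction) / math.sqrt(n * variance)


# Toy per-trial log likelihoods (hypothetical numbers).
ll_model_i = [-0.5, -0.6, -0.4, -0.5]
ll_model_j = [-0.7, -0.9, -0.5, -0.8]
z = vuong_statistic(ll_model_i, ll_model_j, k_i=3, k_j=3)
model_i_better = z > 1.96
```

The statistic is undefined when the pointwise ratios have zero variance, so a practical implementation would need to handle that case explicitly.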

Finally, to quantify how easily a subject can distinguish between a pair of gambles based on their subjective values, we defined “discriminability” in terms of *V*_{R} and *V*_{L}, the values of the right and left gambles, respectively, and *P*_{R}, the probability of selecting the right gamble, with the sum computed over all unique pairs of gambles. The chance level of discriminability is equal to 0.5.

##### Statistical analysis.

We fit choice data using different models to find the best model (using AIC and/or BIC) and to estimate model parameters. Moreover, we used Vuong's test to ensure the quality of model comparison (for more details, see Model comparison and data analysis). Model parameters were compared using a two-sided sign test. We used Pearson's χ^{2} test of homogeneity to compare the ratio of subjects that adopted different types of strategies. For correlation analyses, we used both Pearson and Spearman correlations. For all tests, *p* < 0.05 was considered statistically significant. We compared discriminability and changes in subjective values due to differential weighting within individual subjects. We also performed a between-subject comparison of discriminability between the simple- and complex-gamble tasks.

##### Model recovery.

To test the ability of our fitting procedure to capture the proposed mechanisms for the construction of overall reward value and to extract the correct parameters, we simulated choice data using the proposed models over a wide range of model parameters for the same complex-gamble task performed by the subjects. We then fit the simulated data with each of these models to compute the goodness-of-fit (in terms of AIC) and estimate the original model parameters used to simulate the data. Given the large number of possible models (240), we only considered the most frequent models identified in the experimental data for this analysis. More specifically, we only considered three quantities for sorting the values of individual outcomes (*m*, *p*, and EV) because sorting based on the other five quantities [*u*(*m*), *w*(*p*), EU, EV with *w*(*p*), and SU] produced results very similar to those of the former quantities. Moreover, we only considered Strategy 4 for weighting of values because this was the most frequently used strategy for differential weighting. Based on these specifications, we narrowed our analysis to a total of 24 models (8 evaluation strategies × 3 sorting quantities × 1 weighting strategy). For each of these 24 models, we generated choice data for the same complex-gamble task performed by the subjects (with the same number of parameters, trials, etc.) and fit the resulting choice data using each of the 24 models.

For simulations presented in Figure 3, we generated choice data using different combinations of strategies for sorting, evaluating, and weighting, with different sets of parameters. More specifically, we used different combinations of β_{b}, β_{m}, and β_{w} values: β_{b} = 1/3, β_{m} = 1/3, and β_{w} = 1/3, corresponding to weighting Strategy 1; β_{b} = 1/2, β_{m} = 1/2, and β_{w} = 0, corresponding to Strategy 2; β_{b} = [0:0.25:1], β_{m} = 1 − β_{b}, and β_{w} = 0, corresponding to Strategy 3; and finally β_{b} = [0] and β_{w} = [0.25:0.25:1], β_{b} = [0.25] and β_{w} = [0.25:0.25:0.75], β_{b} = [0.5] and β_{w} = [0.25:0.25:0.5], and β_{b} = [0.75] and β_{w} = [0.25] [for all of which β_{m} = 1 − (β_{b} + β_{w})], corresponding to Strategy 4. Nonetheless, our method can correctly identify the weighting strategy, as is evident in Figure 5, which shows only a small bias in the estimation of β values for the most comprehensive weighting strategy (Strategy 4). For all these simulations, we used the median values of the estimated parameters from our subjects' probability weighting and utility functions (ρ = 0.63 and γ = 0.88).

## Results

The experiment consisted of two tasks. In the first task, subjects selected between either a sure gamble and a simple gamble, or between a pair of simple gambles (simple-gamble task; Fig. 1*A*). In the second task, decisions were made between pairs of three-outcome gambles (complex-gamble task; Fig. 1*B*). In both tasks, gambles were presented as rectangular bars divided into different portions. A portion's color indicated the reward magnitude of that outcome, and its size signaled its probability (see Materials and Methods).

To examine whether subjects comprehended the objective of both tasks, we computed the probability of selecting the gamble with a larger EV on a given trial for each subject. During the simple-gamble task, subjects selected the gamble with a higher EV more often than chance (0.5), with a median equal to 0.79 across all subjects (two-sided sign test, *p* = 1.12 × 10^{−16}; *d* = 4.11). This tendency was weaker in the complex-gamble task, with the median equal to 0.57 (two-sided sign test, *p* = 6.17 × 10^{−4}; *d* = 0.68). Less frequent selection of gambles with a higher EV (i.e., noisier behavior) was expected during the complex-gamble task because the pairs of presented gambles were closer in EV. However, on catch trials of the complex-gamble task (where one of the gambles was better than the other with respect to both reward magnitude and probability), subjects selected the better option more often than chance, with a probability larger than 0.7 and a median of 0.95 (two-sided sign test, *p* = 1.08 × 10^{−19}; *d* = 4.67). Together, these results illustrate that subjects understood the objective of both tasks and were motivated to earn reward points.

#### Evaluation of simple gambles conformed to PT

We used various models to fit choice behavior to identify how individual subjects constructed the overall value of gambles in each task (see Materials and Methods). For the simple-gamble task, we assumed that subjects used one of the following four quantities to evaluate the only non-zero outcome of each gamble: EV; EV with the probability weighting function [*w*(*p*)]; EU; and SU.
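To make the four candidate quantities concrete, the following sketch computes each of them for a single non-zero outcome. The functional forms are assumptions made for illustration only: a power utility function with exponent ρ and a one-parameter Prelec probability weighting function with parameter γ. The forms actually used are specified in Materials and Methods and may differ.

```python
import math

def utility(m, rho):
    """Power utility (assumed form); concave for rho < 1."""
    return m ** rho

def prob_weight(p, gamma):
    """One-parameter Prelec weighting (assumed form); inverse-S-shaped for gamma < 1."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** gamma))

def outcome_value(m, p, kind, rho=1.0, gamma=1.0):
    """Value of one outcome with magnitude m and probability p."""
    if kind == "EV":
        return m * p
    if kind == "EV+w(p)":
        return m * prob_weight(p, gamma)
    if kind == "EU":
        return utility(m, rho) * p
    if kind == "SU":
        return utility(m, rho) * prob_weight(p, gamma)
    raise ValueError(f"unknown evaluation quantity: {kind}")

# Example with the median parameter values reported for our subjects,
# assuming rho belongs to the utility function and gamma to w(p)
su = outcome_value(20, 0.3, "SU", rho=0.63, gamma=0.88)
```

With ρ = γ = 1, all four quantities reduce to EV, which is why the four models are nested variants of one another under these assumed forms.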

Fitting of individual subjects' choice behavior showed that the model based on SU provided the best fit for the majority (83%) of subjects in the simple-gamble task (Fig. 2*A*). This indicates that subjects mainly used subjective utility to evaluate simple gambles. In addition, the estimated utility and probability weighting functions conformed to the predictions of PT. More specifically, the majority of subjects (56 of 64, corresponding to ∼88% of subjects) exhibited concave utility functions (Fig. 2*B*), and most subjects (43 of 64, corresponding to ∼67% of subjects) exhibited an inverse-S-shaped probability weighting function (Fig. 2*C*). Overall, these results demonstrate that PT can successfully account for choice between simple gambles.

#### Modeling evaluation and choice between complex gambles

We hypothesized that attention is involved in the evaluation and choice between gambles with multiple outcomes. Therefore, we developed a family of models that rely on the assumption that the values of individual gamble outcomes are differentially weighted via a simple attentional mechanism. More specifically, attention could be guided by different types of reward information (e.g., reward magnitude, expected value, etc.) to allow for differential weighting of possible outcomes, perhaps via gain modulation. This resulted in a large family of 240 unique models that differed in their assumptions about the strategies used for the evaluation of individual outcomes of a given gamble (eight possible quantities; Fig. 1*C*), for sorting and assigning different weights to individual outcomes (six possible quantities), and for combining outcome values to construct the overall value of a complex gamble (5 possible strategies; Fig. 1*D*; see Materials and Methods). For model comparison, we used the AIC and/or BIC to account for different numbers of parameters in different models.
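The size of the model family and the two goodness-of-fit criteria can be sketched as follows. The labels for the evaluation and sorting quantities follow the text; the numbering of the weighting strategies, the helper names, and the example log likelihoods and parameter counts are illustrative assumptions.

```python
from itertools import product
import math

EVALUATION = ["m", "u(m)", "p", "w(p)", "EV", "EV+w(p)", "EU", "SU"]  # 8 quantities
SORTING = ["m", "p", "EV", "EV+w(p)", "EU", "SU"]                      # 6 quantities
WEIGHTING = [1, 2, 3, 4, 5]                                            # 5 strategies

# Every combination of evaluation, sorting, and weighting defines one model
MODELS = list(product(EVALUATION, SORTING, WEIGHTING))

def aic(log_lik, n_params):
    """Akaike information criterion (smaller is better)."""
    return 2 * n_params - 2 * log_lik

def bic(log_lik, n_params, n_trials):
    """Bayesian information criterion (smaller is better)."""
    return n_params * math.log(n_trials) - 2 * log_lik

def best_model(fits, n_trials=None):
    """Return the model with the smallest AIC (or BIC when n_trials is given).

    fits: dict mapping a model tuple to (log_likelihood, n_params).
    """
    def score(item):
        log_lik, k = item[1]
        return aic(log_lik, k) if n_trials is None else bic(log_lik, k, n_trials)
    return min(fits.items(), key=score)[0]

# Hypothetical fits for two models (values are for illustration only)
example_fits = {MODELS[0]: (-100.0, 3), MODELS[1]: (-95.0, 5)}
```

Because the BIC penalty grows with the number of trials, the two criteria can disagree: in the hypothetical example above, the five-parameter model wins under AIC but loses under BIC once the number of trials is large.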

Considering the complexity of the proposed models, we first tested whether our fitting procedure could identify the underlying mechanisms for the construction of overall reward value and estimate the associated parameters correctly. More specifically, we simulated choice data using the proposed models over a wide range of model parameters for the same complex-gamble task performed by the subjects (see Materials and Methods). We subsequently fit the simulated data with each of these models to compute the goodness-of-fit (in terms of AIC) and to estimate the original model parameters used to simulate the data.

We found that for most cases, the model used to generate the data provided the best fit, indicating that the fitting could be used to identify the mechanism underlying valuation (Fig. 3). We also found that the fitting procedure could not perfectly distinguish between certain strategies used for the evaluation of individual outcomes: *m* and *u*(*m*); *p* and *w*(*p*); and EV, EV with *w*(*p*), EU, and SU. As a result, we took caution when interpreting the fitting results of subjects' actual data with regard to the exact strategies used to evaluate individual outcomes.

### Figure 3-1

**Fitting method was able to correctly identify the type of strategy used for evaluating individual gamble outcomes in the majority of cases.** (A) Plot shows the average goodness-of-fit (in terms of the average normalized AIC over a set of parameters) for fitting choice data generated with a model with a given type of evaluation strategy (single-attribute vs. combined-attribute) and fit with a model using each of the two types of strategies when outcomes were sorted based on reward magnitude. The single-attribute evaluation corresponds to evaluation based on *m*, *u*(*m*), *p*, and *w*(*p*), whereas combined-attribute evaluation corresponds to evaluation based on *EV*, *EV*+*w*(*p*), *EU*, and *SU*. The types of strategy used to generate and fit the data are indicated on the x- and y-axis, respectively, and an average is taken over all models with either single- or combined-attribute evaluation strategies in each quadrant of Figure 3. (B-C) The same as in A but for sorting based on probability (B) or *EV* (C). (D) Plot shows the distribution of the differences in normalized AIC between the best single-attribute and combined-attribute models when sorting based on magnitude. The data were generated using single- or combined-attribute evaluation as indicated in the legend. Dashed lines indicate the medians. (E-F) The same as in D but for sorting based on probability (E) or *EV* (F). (G) Plot reports the percentage of models with certain types of evaluation strategies that best fit data generated with a given type of evaluation strategy and magnitude sorting, across a large set of model parameters. The values on the diagonal show the percentage of correct identification based on data generated using single- or combined-attribute evaluation. The best model was determined based on the minimum AIC value for a given set of data. (H-I) The same as in G but for sorting based on probability (H) or *EV* (I). Overall, the error in identification of the type of evaluation model was relatively small (< 15%).

### Figure 3-2

**Small error in estimating model parameters based on the model used to generate the choice data.** Each column corresponds to a different quantity used to evaluate individual gamble outcomes, and each row corresponds to a different quantity used for sorting outcomes (magnitude, probability, and *EV*, as indicated on the right), which were used to assign different weights to each outcome. Overall, there was small error and negligible bias in the estimated parameters.

In addition, we examined the overall accuracy of our method in identification of the strategy used for evaluating individual outcomes. First, we fit choice data generated by a given model with each of the 24 possible models and computed the “average” AIC across all the models with single-attribute evaluation and those with combined-attribute evaluation. This was done separately for sorting based on magnitude, probability, or EV. We found that the average AIC across all types of sorting was smaller for the correct model (Fig. 3-1*A–C*, diagonals have smaller AIC values than off-diagonals). However, the average AIC of the combined-attribute models was only slightly worse than that of the single-attribute models for fitting data generated with single-attribute strategies when sorting based on magnitude (Fig. 3-1*A*). This finding was not observed when sorting based on probability or EV (Fig. 3-1*B*,*C*).

Average AIC values, however, are not very informative about correct model identification since the best model is determined based on the minimum AIC and not the average AIC. Therefore, for a given set of data, we also computed the difference between normalized AIC for the best models (models with minimum AIC) based on single- and combined-attribute evaluation strategies. We found significant differences between the AIC values supporting the correct model (Fig. 3-1*D–F*). Although smaller for data generated with single-attribute evaluation strategies and magnitude sorting, differences in AIC for the best models allowed the correct model to be identified in the majority of cases. More specifically, the type of evaluation strategy was incorrectly identified in only 3–15% of the instances, with the largest error occurring for data generated with single-attribute evaluation and sorted based on magnitude (Fig. 3-1*G–I*). Overall, these analyses demonstrate that the error in identification of the type of evaluation strategy (single- vs combined-attribute evaluation) is relatively small.
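The logic of identifying the type of evaluation strategy from the minimum AIC within each family can be sketched as follows. This is a simplified illustration: it uses raw rather than normalized AIC values, and the function names and example AIC tables are hypothetical.

```python
SINGLE = {"m", "u(m)", "p", "w(p)"}
COMBINED = {"EV", "EV+w(p)", "EU", "SU"}

def family_aic_difference(aic_by_eval):
    """Best single-attribute AIC minus best combined-attribute AIC.

    aic_by_eval: dict mapping an evaluation quantity to the AIC of its model.
    A positive difference favors combined-attribute evaluation.
    """
    best_single = min(v for k, v in aic_by_eval.items() if k in SINGLE)
    best_combined = min(v for k, v in aic_by_eval.items() if k in COMBINED)
    return best_single - best_combined

def identified_family(aic_by_eval):
    """Label the evaluation type by the family containing the minimum AIC."""
    return "combined" if family_aic_difference(aic_by_eval) > 0 else "single"

def misidentification_rate(true_families, aic_tables):
    """Fraction of simulated datasets whose evaluation type is misidentified."""
    wrong = sum(identified_family(t) != f
                for f, t in zip(true_families, aic_tables))
    return wrong / len(true_families)

# Two hypothetical simulated datasets (AIC values are for illustration only)
table_1 = {"m": 210.0, "p": 205.0, "EV": 200.0, "SU": 198.0}
table_2 = {"m": 190.0, "u(m)": 191.0, "EU": 195.0}
```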

Finally, we validated our fitting method by comparing estimated and actual model parameters. We found the overall relative differences between the actual parameters and the parameters estimated by the same model used for generating the data to be very small, which indicates little to no systematic biases in our analysis (Fig. 3-2). Together, these simulation results support the feasibility of our fitting approach for identifying mechanisms used for the construction of overall reward value in the real data.

#### Subjects used different quantities to evaluate and sort the outcomes of complex gambles

We next fit individual subjects' data with our models to identify the strategies used for sorting, evaluating, and weighting of gamble outcomes. We computed and compared the goodness-of-fit across all models based on all combinations of strategies to find the best overall model for each subject. We also examined the quality of model selection using Vuong's test.

We first examined the quantity used by subjects to sort the possible outcomes of complex gambles into best, middle, and worst outcomes to differentially weigh them. We found that 61% of subjects sorted gamble outcomes based on outcomes' reward magnitudes or probabilities (Fig. 4*A*). This percentage was larger than the percentage of subjects who used a combination of reward information (EV, EV with *w*(*p*), EU, or SU) for sorting outcomes (χ^{2}_{(1)} = 5.28, *p* = 0.012).

### Figure 4-1

**Models based on single-attribute sorting and combined-attribute outcome evaluation can better predict choice behavior.** Plots show mean and standard error of AIC values over all subjects for all the models with different quantities for sorting (A), evaluating individual outcomes (B), and for different types of weighting strategies (C). We also compared the mean AICs across all models with single-attribute vs. combined-attribute sorting (A), or single-attribute vs. combined-attribute evaluation (B). The asterisk indicates that the medians for the two types of models were statistically different from each other (two-sided sign-test, *P* < 0.05). To determine the significance, we calculated the difference between AIC values of all the models with single-attribute sorting (A) or evaluation (B) versus all the models with combined-attribute sorting or evaluation.

### Figure 4-2

**Most subjects sorted and weighted possible outcomes based on reward magnitude or probability (single-attribute sorting) but evaluated each gamble outcome based on a combination of reward attributes (combined-attribute evaluation).** (A) Plot shows the fraction of subjects whose choice was best fit by a given quantity for sorting and the sum over all single-attribute vs. combined-attribute sorting, using BIC as the goodness-of-fit. Overall, most subjects used a single reward attribute for sorting gamble outcomes; reward magnitude was the most used quantity (37.5%) followed by reward probability (25%). Note that identical results were obtained from sorting outcomes based on *m* and *u*(*m*), and for *p* and *w*(*p*). (B) Plot shows the fraction of subjects whose choice was best fit by a given strategy to evaluate individual gamble outcomes. Overall, most subjects (89%) used a combination of reward attributes for evaluating individual gamble outcomes, *EU* being the most used quantity (39%). (C) Plot shows the fraction of subjects whose choice was best fit by a given weighting strategy. Weighting Strategy 4 provided the best fit for the majority (41 out of 64, corresponding to 64%) of subjects. (D–F) Plots are similar to panels A–C but show the results for the subset of subjects for whom the fit based on their best model was significantly different from other models using Vuong’s test (*N* = 45). 35.5% of subjects sorted gamble outcomes based on reward magnitude, whereas 29% used reward probability for sorting (D). 40% of subjects used *EU* to evaluate individual outcomes in complex gambles (E). Strategy 4 provided the best fit for the majority (26 out of 45, corresponding to 58%) of subjects (F).

### Figure 4-3

**Models based on single-attribute sorting and combined-attribute outcome evaluation can better predict choice behavior.** Plots show mean and standard error of BIC values over all subjects for all the models with different quantities for sorting (A), evaluating individual outcomes (B), and for different types of weighting strategies (C). We also compared the mean BICs across all models with single-attribute versus combined-attribute sorting (A), or single-attribute versus combined-attribute evaluation (B). The asterisk indicates that the medians for the two types of models were statistically different from each other (two-sided sign-test, *P* < 0.05). To determine the significance, we calculated the difference between BIC values of all the models with single-attribute sorting (A) or evaluation (B) versus all the models with combined-attribute sorting or evaluation.

We then compared quantity or strategy used by each subject to evaluate individual gamble outcomes. We computed the percentage of subjects who used a given quantity to evaluate individual outcomes, as determined by the model that provided the best fit in terms of the AIC (or BIC, see the last paragraph of this section). We found that 41, 25, 16, and 12% of subjects evaluated individual outcomes based on EU, SU, EV, and EV with *w*(*p*), respectively (Fig. 4*B*). These results demonstrate that most subjects (60 of 64, corresponding to 94% of subjects) combined information about reward probability and magnitude to evaluate individual outcomes of complex gambles.

Therefore, although 94% of subjects combined reward information to evaluate individual gamble outcomes, most subjects used a single piece of reward information when sorting outcomes for weighting. We found similar dissociation between the strategies used for sorting and evaluating individual outcomes based on the mean AIC across all subjects instead of the best model for individual subjects (Fig. 4-1). This stark difference demonstrates that separate mechanisms were involved in evaluating and combining the values of individual outcomes to construct an overall value for complex gambles.

As mentioned earlier, our method could misidentify single-attribute evaluation strategies with combined-attribute ones in <∼15% of model instances when sorting based on magnitude. Considering this relatively small error rate and the fact that we ensured that goodness-of-fit for the best model is significantly better than the second best model (using Vuong's test), we estimate that model misidentification could potentially result in mislabeling only a few subjects (<9), which does not change our main conclusion. However, the fitting procedure could misidentify the exact evaluation strategy among combined-attribute strategies (i.e., EV, EV with *w*(*p*), EU, and SU), and thus, we used caution in interpreting the exact strategy used for combined-attribute evaluation.

To determine how subjects combined outcome values to form the overall value of a gamble (or equivalently, to directly compare gambles), we then compared five alternative models of outcome weighting (Fig. 1*D*). This analysis revealed that weighting Strategy 4, which assigned different weights to the three possible outcomes, provided the best fit for a majority (80%) of the subjects (Fig. 4*C*). Moreover, choice behavior of only 5 of 64 subjects (8%) was best fit by Strategies 2 and 3 (which ignored the worst outcome), indicating that the majority of subjects (92%) considered the values of all three possible gamble outcomes when making decisions.
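A minimal sketch of differential weighting in the spirit of Strategy 4 is given below. It assumes that the overall value of a gamble is a weighted sum of outcome values, with weights assigned by rank after sorting; the exact functional form is specified in Materials and Methods, and the function names, weights, and gamble used here are hypothetical.

```python
def expected_value(m, p):
    """Outcome value as magnitude times probability (one possible evaluation)."""
    return m * p

def by_magnitude(outcome):
    return outcome[0]

def by_probability(outcome):
    return outcome[1]

def gamble_value(outcomes, weights, sort_key, evaluate):
    """Weighted sum of outcome values after rank-ordering the outcomes.

    outcomes: list of (magnitude, probability) pairs
    weights:  (beta_best, beta_middle, beta_worst), assigned by rank
    sort_key: quantity used for sorting (may differ from `evaluate`)
    evaluate: quantity used to value each individual outcome
    """
    ranked = sorted(outcomes, key=sort_key, reverse=True)
    return sum(beta * evaluate(m, p) for beta, (m, p) in zip(weights, ranked))

# Hypothetical three-outcome gamble and weights (illustration only)
gamble = [(40, 0.2), (20, 0.3), (10, 0.5)]
v_equal = gamble_value(gamble, (1.0, 1.0, 1.0), by_magnitude, expected_value)
v_mag = gamble_value(gamble, (1.5, 0.5, 1.0), by_magnitude, expected_value)
v_prob = gamble_value(gamble, (1.5, 0.5, 1.0), by_probability, expected_value)
# Setting the last weight to zero mimics a strategy that ignores the worst outcome
v_no_worst = gamble_value(gamble, (1.0, 1.0, 0.0), by_magnitude, expected_value)
```

Keeping `sort_key` and `evaluate` as separate arguments mirrors the central feature of the model family: the quantity that determines the rank of an outcome need not be the quantity that determines its value.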

To examine whether the best model for each subject is significantly better than the rest of the models, we performed Vuong's test (Vuong, 1989; see Materials and Methods). We found that 19 subjects (∼30% of subjects) showed no significant difference between the best model (model with minimum LL value) and the second best model. However, for the majority of subjects (45 subjects, equal to 70% of subjects), we found the same pattern of results as in our original analysis using all data (Fig. 4*D–F*).
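For illustration, the basic form of Vuong's statistic for two non-nested models can be sketched from per-trial log likelihoods as shown below. This is the uncorrected version of the test, without an adjustment for differences in the number of parameters, and it is not necessarily identical to the variant described in Materials and Methods; the function name and example values are hypothetical.

```python
import math
from statistics import NormalDist

def vuong_test(ll_a, ll_b):
    """Uncorrected Vuong test for two non-nested models.

    ll_a, ll_b: per-trial log likelihoods of the same trials under models A and B.
    Returns (z, two-sided p-value); z > 0 favors model A.
    """
    n = len(ll_a)
    diffs = [a - b for a, b in zip(ll_a, ll_b)]
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / n
    if var_d == 0:
        raise ValueError("per-trial log-likelihood differences have zero variance")
    # z = sum(diffs) / (sqrt(n) * sd), written in terms of the mean difference
    z = math.sqrt(n) * mean_d / math.sqrt(var_d)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

# Hypothetical per-trial log likelihoods (illustration only)
z, p = vuong_test([-0.5, -0.4, -0.6, -0.3], [-0.9, -1.0, -0.8, -1.1])
```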

Finally, to further illustrate that our results do not depend on the specific metric for the goodness-of-fit, we repeated our model selection and strategy identification based on the BIC. As shown in Fig. 4-2 and Fig. 4-3, we found qualitatively similar results using BIC as the goodness-of-fit and our main conclusions about the strategies used for sorting, evaluating, and weighting of individual outcomes still hold.

#### Differential weighting could not be captured by PT

Considering the complexity of models used for fitting, we also examined whether the proposed differential weighting is necessary to capture choice behavior (i.e., our results are not affected by overfitting) beyond what can be explained by changes in the utility and probability weighting functions in PT. In addition, we also tested whether our fitting approach could identify the underlying model parameters without any systematic bias in the presence and absence of differential weighting (an extension of the analyses presented in Fig. 3).

To that end, we generated choice data based on three models that evaluated individual gamble outcomes using subjective value (the most common quantity used for evaluation among our subjects), sorted these outcome values based on magnitude, probability, or expected value, and then combined these outcome values based on Strategy 4 (the most common weighting strategy used among our subjects) using a wide range of weighting parameters similar to those used for simulations presented in Figure 3. We then fit these simulated data with each of the models used to generate the data as well as with the model without differential weighting.

We found that the model with differential weighting was able to fit simulated choice data very well and captured the original model parameters without any bias and with small error in most cases (Fig. 5 and Fig. 5-1, Fig. 5-2, Fig. 5-3). In contrast, the model without differential weighting was not able to fit the simulated data well and provided systematically biased estimates of risk-preference parameters. More specifically, the estimated ρ and γ values based on the model without differential weighting were smaller than the actual values (Fig. 5 and Fig. 5-2). This indicates that some of the previously observed concavity of the utility function and curvature of the inverse-S-shaped probability weighting function could be due to differential weighting mediated via attentional mechanisms. Together, these results not only validate our approach but also illustrate that prospect theory cannot be used to fit data for which value construction is influenced by differential weighting.

### Figure 5-1

**The model without differential weighting cannot fit choice data generated with differential weighting and nonlinear utility and probability weighting functions.** (A) Plot shows the average goodness-of-fit (in terms of negative log likelihood; a smaller number corresponds to a better fit) for fitting choice data generated by models with different sets of weights, as indicated by distance from each corner of the ternary plot, and with sorting based on reward magnitude. For these simulations, the values of individual gamble outcomes were computed using a concave utility function and an inverse-S-shaped probability weighting function based on the average risk-preference in our subjects (ρ = 0.6 and γ = 0.9). The inset shows the distribution of average negative log likelihood across all models. (B-C) The same as in panel A but for sorting based on reward probability and *EV*, respectively. (D-F) The same as in panels A-C but when the choice data are fit with the model without differential weighting. Overall, the model without differential weighting failed to fit data for most weight values.

### Figure 5-2

**The model with differential weighting can estimate original parameters without a significant bias, whereas the model without differential weighting systematically underestimates ρ and γ.** (A) Plots show the distribution of the relative error in the estimation of weights assigned to the best, medium, and worst outcomes (from left to right). For these simulations, the values of individual gambles were computed using linear utility and probability weighting functions (ρ = 1.0 and γ = 1.0), and outcomes were sorted based on reward magnitude. (B) The distribution of estimated values for ρ and γ across all models. Overall, the estimated parameters were centered around the actual values. (C) The distribution of estimated values for ρ and γ when choice data are fit with a model without differential weighting. The estimated parameters were significantly smaller than the actual values. (D-F) The same as in panels A-C but for sorting based on reward probability. (G-I) The same as in panels A-C but for sorting based on *EV*.

### Figure 5-3

**The model without differential weighting cannot fit choice data generated with differential weighting and linear utility and probability weighting functions.** (A) Plot shows the average goodness-of-fit (in terms of negative log likelihood; a smaller number corresponds to a better fit) for fitting choice data generated by models with different sets of weights, as indicated by distance from each corner of the ternary plot, and with sorting based on reward magnitude. For these simulations, the values of individual gamble outcomes were computed using linear utility and probability weighting functions (ρ = 1.0 and γ = 1.0). The inset shows the distribution of average negative log likelihood across all models. (B-C) The same as in panel A but for sorting based on reward probability and *EV*, respectively. (D-F) The same as in panels A-C but when the choice data are fit with the model without differential weighting. Overall, the model without differential weighting failed to fit the data for most weight values.

#### Our model accounts for choice behavior better than competing models

To compare our model with competing models for valuation of complex gambles, we fit our experimental data with four rank-dependent models: CPT, TAX, STC, and DFT (see Materials and Methods).

We found that our model can better predict choice behavior for the majority (55%) of subjects (Fig. 6*A*). We also compared the ability of our model versus each of the competing models and found that the best competing model (STC) could fit data better for only one-third of subjects (percentage of subjects that were better fit with a competing model: CPT = 14%; TAX = 20%; DFT = 12%; STC = 32.8%; Fig. 6*B*). Overall, these results demonstrate that our plausible yet simpler model can outperform more complex models in capturing choice behavior. Importantly, our model distinctly allows for the evaluation of individual outcomes and their combination to rely on different pieces of information (e.g., evaluation based on expected value but differential weighting based on magnitude). Therefore, the superiority of our model in capturing individual subjects' choice behavior could indicate that individual variability in evaluating complex gambles could arise from differences in the type of reward information that guides attention.

#### Subjects assigned larger weights to the best and worst outcomes in terms of reward magnitude and probability

Having established that most subjects used differential weighting to combine reward values of possible outcomes, we then examined the weights assigned to the three outcomes within a gamble. For subjects who sorted outcomes based on a single reward attribute, we observed significant differences in weight assignments (Fig. 7*A–D*). More specifically, the weight assigned to the best (largest magnitude or probability) outcome was significantly greater than that of the middle outcome [β_{b} − β_{m} = 0.13 ± 0.27 (mean ± SD), two-sided sign test, *p* = 0.02, *d* = 0.44; Fig. 7*C*]. The worst outcome also had a significantly greater weight compared with the middle outcome [β_{w} − β_{m} = 0.09 ± 0.37 (mean ± SD), two-sided sign test, *p* = 0.0005, *d* = 0.23; Fig. 7*D*]. There was no significant difference between the weights for the best and worst outcomes (two-sided sign test, *p* = 0.61, *d* = 0.11).

### Figure 7-1

**Distributions of the coefficient of variation (CV) for estimated model parameters.** Plots show the distributions of CV for estimated weighting parameters of the best (**A**), middle (**B**), and worst (**C**) outcomes across all subjects. CV for each subject was calculated by first fitting the best model based on randomly sampled 90% of data and computing the mean and standard deviation of the estimated parameter distributions (CV is equal to the standard deviation divided by the mean). Blue dashed line shows the median of each distribution. Solid black line shows the median of the average CV value for all estimated model parameters in the simple-gamble task as a baseline for variability in parameter estimation.

### Figure 7-2

**No evidence for differential weighting caused by certain color-reward assignments.** Plots show the distribution of relative weight differences separately for each group of subjects with a specific color-reward assignment. Each inset indicates the color-reward assignment specific to the corresponding group; colors associated with the largest, middle, and smallest reward magnitudes are shown from top to bottom. The blue dashed lines indicate the medians. There were no significant weight differences between the best and middle outcomes (A) or the worst and middle outcomes (B) in any of the groups (two-sided sign-test; *P* > 0.05). In addition, the differences in weights were not significantly different between any pair of groups (two-sided ranksum test; *P* > 0.05).

These results illustrate that the most important outcomes (best and worst based on a given subject's sorting) were assigned larger weights for the construction of overall reward value. In contrast, for subjects who sorted outcomes based on a combination of reward information (EV, EV with *w*(*p*), EU, or SU), there were no significant weight differences between the best and middle outcomes [β_{b} − β_{m} = 0.06 ± 0.30 (mean ± SD), two-sided sign test, *p* = 0.7, *d* = 0.17; Fig. 7*E–G*] or between the worst and middle outcomes [β_{w} − β_{m} = −0.08 ± 0.45 (mean ± SD), two-sided sign test, *p* = 0.98, *d* = 0.19; Fig. 7*H*]. These results indicate that differential weighting of gamble outcomes was consistent mainly among subjects who sorted outcomes based on a single piece of reward information.

To address the robustness and consistency of our method and estimated model parameters, we also measured the variability in the estimation of model parameters. More specifically, we used the best model for a given subject to fit 90% of choice data (randomly sampled) from that subject and estimated all model parameters. We then repeated this procedure 100 times to calculate the distributions of estimated parameters for each subject. Using the distribution of each model parameter, we then computed the coefficient of variation (CV; equal to the SD of the distribution divided by its mean), as a standardized measure of dispersion in the estimated model parameters. Small values of CV indicate consistency or robustness of estimated parameters. To provide a baseline, we also computed CV for model parameters in the simple-gamble task using the same procedure described above.
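The resampling procedure can be sketched as follows. The sketch assumes that the 90% subsets are drawn without replacement and uses a generic `estimate` callable as a stand-in for fitting the best model of a given subject; both the function name and the toy estimator in the example are hypothetical.

```python
import random
from statistics import mean, stdev

def subsample_cv(trials, estimate, frac=0.9, n_repeats=100, seed=0):
    """Coefficient of variation of a parameter estimated on random subsets.

    trials:    list of trial records for one subject
    estimate:  callable mapping a list of trials to one parameter estimate
               (stand-in for fitting the best model of that subject)
    frac:      fraction of trials sampled on each repetition
    n_repeats: number of repetitions
    """
    rng = random.Random(seed)
    k = max(2, round(frac * len(trials)))
    # Sample without replacement (an assumption) and re-estimate each time
    estimates = [estimate(rng.sample(trials, k)) for _ in range(n_repeats)]
    return stdev(estimates) / mean(estimates)

# Toy example: the estimator is simply the mean of hypothetical trial values
toy_trials = [float(x) for x in range(1, 101)]
toy_cv = subsample_cv(toy_trials, mean)
```

Note that the CV is only meaningful for parameters whose mean estimate is not close to zero, since the ratio becomes unstable in that case.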

Overall, we found relatively small CV values for the estimated weighting parameters of all reward outcomes (median CV = 0.21, 0.17, and 0.25 for the best, middle, and worst outcomes, respectively; Fig. 7-1). As a baseline for comparison, the median of CV of the estimated model parameters (ρ, γ and σ) across subjects in the simple-gamble task was 0.16. These results illustrate that our method can estimate model parameters consistently, and thus, our method can be used to make reliable inferences about differential weighting of possible outcomes.

In our study, reward magnitude was represented by specific combinations of colors for different subjects. Therefore, we also verified that our observations were not driven by certain combinations of color–reward assignments (e.g., red for the largest reward could be more effective than green); that is, there was no systematic color bias. We categorized subjects into six possible groups based on their color–reward assignments and examined differential weighting for each group (Fig. 7-2). This analysis did not reveal any evidence for systematic differences between the groups, indicating that differential weighting of possible outcomes was unlikely to be caused by specific color–reward assignments.

#### Larger weighting of the best and worst outcomes was not driven by information seeking

In designing complex gambles, we only ensured that the subjective utilities of each gamble pair were close in value for each subject. Therefore, it is possible that the distribution of the probability of middle outcomes across all trials was less variable (or dispersed) than those of the best and worst outcomes. A larger dispersion of the distributions for the best and worst outcome probabilities may result in these outcomes being perceived as more informative and, thus, weighted more strongly. To exclude such an explanation for our observation, we computed the distributions of reward probability for the best, middle, and worst outcomes (when sorting based on magnitude) for each subject and calculated SD (as a measure of dispersion) of these distributions across all subjects.

As shown in Figure 8, we did not find any significant difference between the medians of standard deviations of outcome probabilities for the best, medium, and worst outcomes. Therefore, we did not find any evidence for the dispersion of outcome probabilities (and thus the informativeness of outcomes) to underlie the observed overweighting of the best and worst outcomes.

#### Differential weighting could enable subjects to more easily and quickly choose between complex gambles

To address possible advantages of differential weighting, we examined the influence of this mechanism on overall risk preference and whether it allowed subjects to make decisions more easily. Critically, we designed the complex-gamble task such that the pair of gambles presented on each trial had similar SU (using a wide range of reward probabilities) to detect additional mechanisms involved in the construction of reward value (see Materials and Methods). This feature also made decision making more difficult in the complex-gamble task.

To quantify the influence of differential weighting on overall risk preference, we computed the relative change in the value of each gamble after the inclusion of differential weighting based on estimated weights for each subject (for all gambles used in the complex-gamble task). We found that when subjects used reward magnitude for sorting, assigning a larger weight to the best outcome (i.e., outcome with the largest magnitude) resulted in an increase in the value of gambles, thereby increasing risk-seeking behavior (Fig. 9*A*). Similar but weaker changes in subjective values were observed in the case of sorting based on reward probability (Fig. 9*B*).

### Figure 9-1

**Changes in risk preference due to differential weighting.** (**A**) Plots show the change in the subjective value of a three-outcome gamble after including differential weighting, with a given set of weights indicated by their distances from each corner of the ternary plot, based on magnitude sorting. The two values on each corner are the reward magnitude and probability associated with the best (lower left corner), medium (upper corner), and worst (lower right corner) outcomes, as determined by reward magnitude. The subjective value of the gamble increased when the largest weight was assigned to the outcome with the largest expected value (i.e., the product of the reward magnitude and probability on a given corner). (**B-D**) The same as in panel A but with different sets of reward probabilities associated with the three values of reward magnitude.

To better understand how risk preference is influenced by differential weighting, we also computed the relative change in the value of each gamble after the inclusion of differential weighting using a wide range of weights and reward probabilities when sorting based on magnitude (Fig. 9-1). We found that the change in overall value depended on which outcome was assigned the largest weight. More specifically, the overall value of a three-outcome gamble always increased if the outcome with the highest expected value (or subjective utility) was assigned the largest weight, resulting in more risk-seeking behavior. In contrast, the overall value decreased if the outcome with the lowest expected value was assigned the largest weight, resulting in more risk-averse behavior.
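The computation described above can be illustrated with a minimal sketch. It is not the fitted model from this study: it assumes a power utility `u(m) = m**rho`, leaves probabilities undistorted, and normalizes the rank weights so that equal weights recover the unweighted sum. All function names, the parameter `rho`, and the example gamble are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Outcome = Tuple[float, float]  # (reward magnitude, reward probability)


def outcome_values(outcomes: Sequence[Outcome], rho: float = 1.0) -> List[float]:
    """Value each outcome as u(m) * p, with an assumed utility u(m) = m**rho."""
    return [(m ** rho) * p for m, p in outcomes]


def gamble_value(outcomes: Sequence[Outcome],
                 weights: Optional[Sequence[float]] = None,
                 sort_by: str = "magnitude",
                 rho: float = 1.0) -> float:
    """Weighted sum of outcome values after ranking outcomes best-to-worst.

    weights[k] applies to the outcome of rank k (0 = best) on the sorting
    attribute. Weights are rescaled to sum to the number of outcomes
    (an assumption), so equal weights give the plain unweighted sum.
    """
    if sort_by not in ("magnitude", "probability"):
        raise ValueError("sort_by must be 'magnitude' or 'probability'")
    key = 0 if sort_by == "magnitude" else 1
    ranked = sorted(outcomes, key=lambda o: o[key], reverse=True)
    values = outcome_values(ranked, rho)
    if weights is None:
        weights = [1.0] * len(ranked)
    scale = len(ranked) / sum(weights)
    return sum(scale * w * v for w, v in zip(weights, values))


def relative_change(outcomes: Sequence[Outcome],
                    weights: Sequence[float],
                    sort_by: str = "magnitude",
                    rho: float = 1.0) -> float:
    """Relative change in gamble value caused by differential weighting."""
    base = gamble_value(outcomes, None, sort_by, rho)
    return (gamble_value(outcomes, weights, sort_by, rho) - base) / base


# Hypothetical three-outcome gamble: (magnitude, probability) per outcome.
gamble = [(30.0, 0.5), (20.0, 0.3), (10.0, 0.2)]
up = relative_change(gamble, [0.5, 0.25, 0.25])    # largest weight on best outcome
down = relative_change(gamble, [0.25, 0.25, 0.5])  # largest weight on worst outcome
print(f"best-weighted: {up:+.3f}, worst-weighted: {down:+.3f}")
```

In this example the best outcome by magnitude also has the highest expected value, so shifting weight toward it raises the gamble's value and shifting weight toward the worst outcome lowers it. Under a different normalization or utility function the size of the change would differ.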

To quantify how easily a given subject could distinguish between pairs of gambles, we defined discriminability based on the subjective values of gambles estimated for that subject (see Materials and Methods). We computed and compared discriminability for each subject in the complex-gamble task as well as in the simple-gamble task using the best subject-specific models with and without differential weighting, respectively. We found the effect of differential weighting on discriminability to be more complex than that on the overall gamble value, but overall, differential weighting increased discriminability for most weight values (Fig. 9*D*,*E*). As expected, discriminability was smaller across all subjects in the complex-gamble task compared with the simple-gamble task because of differences in task design (Fig. 9*C*). The inclusion of differential weighting, however, significantly increased discriminability across all subjects during the complex-gamble task (two-sided sign test, *p* = 1.4 × 10^{−10}, *d* = 0.51). In other words, subjects who used differential weighting could more easily discriminate, and thus choose, between gambles.
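For readers unfamiliar with the test used here, the following is a minimal sketch of an exact two-sided sign test on paired values, such as per-subject discriminability with and without differential weighting. The function name and the numbers are hypothetical illustrations, not data or code from this study, and the sketch computes only the *p* value, not the effect size.

```python
from math import comb
from typing import Sequence


def sign_test_two_sided(x: Sequence[float], y: Sequence[float]) -> float:
    """Exact two-sided sign test for paired samples (tied pairs are dropped).

    Under the null hypothesis, positive and negative differences are
    equally likely, so the count of positive signs is Binomial(n, 0.5).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 1.0  # no informative pairs
    k = sum(d > 0 for d in diffs)      # number of positive differences
    tail = min(k, n - k)               # size of the smaller tail
    p_one = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2.0 * p_one)


# Hypothetical discriminability values for ten subjects.
with_dw = [0.62, 0.58, 0.71, 0.66, 0.60, 0.69, 0.64, 0.59, 0.73, 0.67]
without_dw = [0.55, 0.54, 0.63, 0.61, 0.57, 0.60, 0.62, 0.52, 0.66, 0.61]
print(sign_test_two_sided(with_dw, without_dw))
```

The sign test uses only the direction of each paired difference, which makes it robust to outliers in the magnitude of the change.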

To test whether this increase in discriminability also allowed for faster decision making, we examined the correlation between the change in the average reaction time between the simple- and complex-gamble tasks and the corresponding change in discriminability because of differential weighting within individual subjects. We found a significant negative correlation between the change in reaction time and the change in discriminability for subjects who sorted outcomes based on reward probability or magnitude (Pearson correlation: *r* = −0.32, *p* = 0.035; Spearman correlation: *r* = −0.39, *p* = 0.017; Fig. 9*F*). A similar result was found when considering all subjects (Pearson correlation: *r* = −0.34, *p* = 0.006; Spearman correlation: *r* = −0.38, *p* = 0.002). This indicates that subjects became relatively faster in the complex-gamble task depending on the extent to which they used differential weighting for discrimination between gambles. Together, these results suggest that differential weighting enabled subjects to more easily and quickly select between three-outcome gambles.
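The two correlation coefficients reported above can be computed as in the following minimal sketch, which uses only the Python standard library. The per-subject values are hypothetical, the function names are illustrative, and the sketch returns only the coefficients, not the associated *p* values.

```python
from math import sqrt
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values: Sequence[float]) -> List[float]:
    """Ranks starting at 1; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical per-subject changes (complex minus simple task).
delta_discriminability = [0.01, 0.03, 0.05, 0.08, 0.12]
delta_reaction_time = [0.40, 0.35, 0.33, 0.20, 0.18]  # seconds
print(pearson_r(delta_discriminability, delta_reaction_time))
print(spearman_r(delta_discriminability, delta_reaction_time))
```

Reporting both coefficients is a common check that a linear association is not driven by a few extreme subjects, because the Spearman coefficient depends only on rank order.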

## Discussion

By comparing choice between simple and three-outcome gambles within subjects, we examined how valuation and decision-making between more complex gambles are influenced by attentional mechanisms. We found that choice between simple gambles was consistent with prospect theory since evaluation based on subjective utility provided the best fit for our data. When evaluating three-outcome gambles, subjects also combined reward probability and magnitude to assign a value to each gamble outcome, but at the same time, most subjects differentially weighted possible outcomes based on either reward magnitude or probability. These results point to a novel dissociation between how reward information of complex gambles is processed: valuation of each outcome is based on a combination of reward information, whereas weighting of possible outcomes mainly relies on a single piece of reward information. This flexible weighting of possible outcomes, in turn, allowed for a more dynamic construction of reward value and enabled easier and faster decision making, especially for difficult choices between options with similar objective or subjective values. Together, our study reveals a plausible, salience-driven mechanism underlying valuation of complex gambles.

Currently, there are a number of sophisticated models for valuation and choice between complex gambles. Most of these models rely on a rank-dependent mechanism for processing alternative outcomes. Although our model does not use a cumulative weighting function, its weighting mechanism makes it resemble the STC model. Unlike our model, however, STC and other competing models examined here require complex computations, and it is unclear how these computations can be instantiated in the brain. In addition, none of these competing models have been used to fit choice data from individual subjects, and thus, there is no evidence that they can capture individual variability. Although choice behavior of some subjects is better captured by more complex models, our heuristic model provides a better and more plausible fit for the majority of subjects.

Given our limited processing resources, exhaustively weighting and summating all possible outcomes to evaluate an option is not feasible unless we can simplify valuation and decision-making processes using some form of heuristics (Gigerenzer and Goldstein, 1996; Brandstätter et al., 2006; Gigerenzer and Gaissmaier, 2011). In many cases, information must somehow be prioritized to avoid cognitive overload. Such prioritization has been assumed to be performed mainly via attentional mechanisms (Treisman and Gelade, 1980; Klein, 1988; McLeod et al., 1988; Wolfe and Horowitz, 2004; Watson and Kunar, 2010; Russell and Kunar, 2012). It has been shown that both bottom-up and top-down attention can influence processing and integration of reward information and ultimately choice behavior (Krajbich et al., 2010; Tsetsos et al., 2012; Kunar et al., 2017). By selectively processing certain outcomes within complex risky options, the decision maker can reduce computational demands required for evaluation of such options.

In the case of multi-attribute options, decision making can be very difficult because it requires weighting the pros and cons of options that differ across multiple, sometimes incommensurate, dimensions (Fellows, 2006). Therefore, various heuristics have been proposed for reducing the computational demands and complexity of valuation and decision processes in such cases (even in the absence of risk). This includes differential weighting of different dimensions, limiting the amount of information, and reducing the number of alternatives to be considered (Payne et al., 1993; Gigerenzer and Goldstein, 1996; Gigerenzer and Todd, 1999). Although we assumed that each gamble is assigned an overall subjective value, decision making in our model can also be interpreted as weighted comparisons across individual outcomes based on reward probability or magnitude; that is, the subjects could directly choose between gambles by comparing values for similar individual outcomes and combine such comparisons across all possible outcomes (Tversky, 1969, 1972). This further illustrates how attention can modulate choice between risky options by differentially weighting various comparisons across possible attributes or outcomes.
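The equivalence between comparing overall values and combining outcome-by-outcome comparisons can be written out explicitly. The notation here is introduced only for illustration and assumes that both gambles are sorted on the same attribute, that the same rank weights $w_k$ apply to both, and that choice depends on the difference in overall value. With $v_{A,k}$ and $v_{B,k}$ denoting the values of the rank-$k$ outcomes of gambles $A$ and $B$:

$$
V_A - V_B \;=\; \sum_{k=1}^{3} w_k\, v_{A,k} \;-\; \sum_{k=1}^{3} w_k\, v_{B,k} \;=\; \sum_{k=1}^{3} w_k \left( v_{A,k} - v_{B,k} \right)
$$

Under these assumptions, the difference in overall value is a weighted sum of comparisons between rank-matched outcomes, so the two interpretations predict the same choices.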

Critically, we observed a dissociation between what drives the evaluation of individual outcomes (i.e., a combination of reward information) and what drives the weighting process (i.e., a single piece of reward information), which indicates that selective processing of reward information may not rely on a combination of probability and magnitude. The sorting of outcomes based on a single reward attribute can be seen as a more general case of the elimination-by-aspects theory (Tversky, 1972). One possible mechanism for such processing is selective attention (Busemeyer and Townsend, 1993; Roe et al., 2001; Shimojo et al., 2003; Hayden et al., 2008; Ludvig et al., 2014). In our task, the most relevant form of attention is feature-based attention, which could selectively enhance the representation of certain visual attributes (e.g., color or size) at the expense of the others (Carrasco, 2011). Given the visual presentation of gambles in our experiment, subjects could attend to certain “learned” reward features (color and size) within the gamble and thus weigh outcomes by their reward salience. Interestingly, we found that subjects assigned larger weights to both the best and worst outcomes, indicating that extreme outcomes were most salient. A plausible neural mechanism for this differential weighting could be biased competition associated with feature-based attention (Reynolds et al., 1999; Kastner and Ungerleider, 2001; Beck and Kastner, 2005) or competition at multiple levels of value representation (Jocham et al., 2012; Hunt et al., 2014). The increased weighting of the best outcome in terms of magnitude is consistent with observed increases in attention toward larger rewards (Della Libera and Chelazzi, 2006; Raymond and O'Brien, 2009) and results in risk-seeking behavior. In contrast, larger weighting of the worst outcome can contribute to risk aversion. Together, our results suggest that attentional processes could contribute to differential weighting of reward outcomes by their salience, which simplifies valuation and ultimately results in flexible adjustments of risk attitudes.

The observed dissociation between the types of reward information used for evaluation of individual outcomes and for weighting of alternative outcomes suggests separate mechanisms through which reward influences choice and the selective processing of information (Soltani et al., 2016; Rakhshan et al., 2018) and highlights the importance of attention for adaptive choice under risk at the expense of optimality (Farashahi et al., 2017a,b). More specifically, although each possible outcome should be evaluated based on a strategy that combines different pieces of reward information (and thus could be optimal), differential weighting based on a single quantity (e.g., reward magnitude or probability) can enable flexibility depending on the state of the decision maker. For example, when hungry, the decision maker could attend more to cues that represent the amount of food, or reward. This could result in more risk-seeking behavior but also faster and easier decision making, both of which are crucial for survival. Finally, the success of our model in capturing individual subjects' choice behavior indicates that individual variability in evaluating complex gambles could arise from differences in the type of reward information that guides attention.

Therefore, differential weighting of outcomes provides a plausible mechanism for flexible risk attitudes (Huber et al., 1982; Lattimore et al., 1992; Hey and Orme, 1994; Stewart et al., 2003; Bruhin et al., 2010; Ludvig et al., 2013; Rigoli et al., 2016; Fujimoto and Takahashi, 2016) and results in a tradeoff between optimality and flexibility. We have recently shown that a simpler version of our model can account for monkeys' choice behavior during choice between simple gambles, providing further evidence for differential weighting (Farashahi et al., 2018). We speculate that observed deviations from normative theories of choice could be due to such weighting mechanisms, reflecting the flexibility required for decision making in dynamic environments (Payne et al., 1988). Together, our results shed light on possible neural mechanisms of choice under risk in naturalistic settings, and moreover, highlight the role of attention in the flexible construction of reward value for complex gambles.

We assert that although our study does not include neuronal measurements, our extensive model fitting and data analyses provide a simple but plausible mechanism for how the values of complex gambles are constructed, or equivalently, how these gambles are compared. The main mechanism proposed here (i.e., differential weighting of individual gamble outcomes) is simple enough that it can be easily implemented via attentional mechanisms. Specifically, attention can be guided to certain gamble outcomes based on a single attribute and subsequently change (perhaps via gain modulation) the influence of those outcomes on the overall value or choice between a pair of gambles. Our novel finding about the dissociation between how reward information is processed when evaluating complex gambles can be tested in future experiments using neuronal recording.

## Footnotes

This work was supported by an NSF grant (EPSCoR Award 1632738) to A.S. We thank Daeyeol Lee for helpful comments on the paper.

The authors declare no competing financial interests.

- Correspondence should be addressed to Alireza Soltani at soltani{at}dartmouth.edu.