Abstract
The Iowa gambling task (IGT) is one of the most influential behavioral paradigms in reward-related decision making and has been, most notably, associated with ventromedial prefrontal cortex function. However, performance in the IGT relies on a complex set of cognitive subprocesses, in particular integrating information about the outcome of choices into a continuously updated decision strategy under ambiguous conditions. The complexity of the task has made it difficult for neuroimaging studies to disentangle the underlying neurocognitive processes. In this study, we used functional magnetic resonance imaging in combination with a novel adaptation of the task, which allowed us to examine separately activation associated with the moment of decision or the evaluation of decision outcomes. Importantly, using whole-brain regression analyses with individual performance, in combination with the choice/outcome history of individual subjects, we aimed to identify the neural overlap between areas that are involved in the evaluation of outcomes and in the progressive discrimination of the relative value of available choice options, thus mapping the two fundamental cognitive processes that lead to adaptive decision making. We show that activation in right ventromedial and dorsolateral prefrontal cortex was predictive of adaptive performance, in both discriminating disadvantageous from advantageous decisions and confirming negative decision outcomes. We propose that these two prefrontal areas mediate shifting away from disadvantageous choices through their sensitivity to accumulating negative outcomes. These findings provide functional evidence of the underlying processes by which these prefrontal subregions drive adaptive choice in the task, namely through contingency-sensitive outcome evaluation.
Introduction
Decision making—an elaborate, non-unitary process—integrates cognitive and affective processing of action consequences driving ongoing behavior. One of the most influential decision-making paradigms, the Iowa gambling task (IGT) (Bechara et al., 1994), attributes reward and punishment to decisions made under ambiguity, mimicking real-life decision making (Bechara et al., 2000a).
Participants choose from four card decks, aiming to maximize profit on facsimile monetary rewards. Unbeknown to them, two decks lead to higher immediate wins but long-term loss (“disadvantageous”), and two lead to lower immediate wins but long-term gains (“advantageous”). Successful performance then relies on multiple cognitive operations, such as memory for the relative contingencies as they unravel with repeated sampling, integrating affective information into future strategy, and inhibiting alluring but risky choices; this makes the IGT difficult to disentangle neurophysiologically (Brand et al., 2006; Dunn et al., 2006).
The IGT was originally developed to formalize decision-making impairments in ventromedial prefrontal cortex (vmPFC) lesion patients, who prefer immediate over long-term gains, a pattern coined “temporal myopia” (Bechara et al., 1994). Consequently, it has been used widely in impulsivity disorders (Brand et al., 2006). Lesion and functional magnetic resonance imaging (fMRI) studies implicate the vmPFC (Bechara et al., 1998; Fukui et al., 2005; Lawrence et al., 2009), dorsolateral prefrontal cortex (dlPFC) (Manes et al., 2002; Clark et al., 2003), insula, anterior cingulate (ACC), presupplementary motor area (pre-SMA), and amygdala (Ernst et al., 2002; Bolla et al., 2003, 2005) in task performance.
A problem of the dominant IGT analysis, comparing advantageous with disadvantageous choices, is that these labels do not necessarily reflect participants' experience during decision making (Maia and McClelland, 2004). In fMRI, contrasting advantageous and disadvantageous trials cannot probe networks dynamically engaged as participants sample the task contingencies. Rather, analysis should incorporate participants' outcome history, as is typical in reward-learning studies (Busemeyer and Stout, 2002; Dunn et al., 2006). This shifts the analytical focus from the moment of decision to the evaluation of outcomes.
Reward and punishment are processed by distinct networks (Yacubian et al., 2006; Wrase et al., 2007), involving medial PFC and striatum in reward (Knutson et al., 2003; Haruno et al., 2004; O'Doherty et al., 2004), and insula, ACC, and lateral PFC in punishment (O'Doherty et al., 2003; Gottfried and Dolan, 2004; Frank et al., 2005; Liu et al., 2007). However, in the IGT, outcomes immediately follow choices, and contrasting trial types necessarily comeasures decision making and outcome evaluation. Consequently, little is known about reinforcement valence processing (Lawrence et al., 2009), a major shortfall given valence-related performance effects in healthy (Wood et al., 2005; Wheeler and Fellows, 2008) and patient (Dalgleish et al., 2004) populations. Valence processing depends on reinforcement context (Nieuwenhuis et al., 2005). Regions sensitive to context (“framing”) include ventral PFC, insula, amygdala, and ACC (Elliott et al., 2000; Baxter and Murray, 2002; Gottfried et al., 2003; Pickens et al., 2003; Rogers et al., 2004; Taylor et al., 2006). In a previous fMRI IGT study, ventral PFC activation was driven primarily by outcome consistency, as well as valence (Windmann et al., 2006).
In this study, we temporally dissociated choices and outcomes by modifying a recent fMRI IGT adaptation (Lawrence et al., 2009), thus independently examining activation during decisions and outcomes. Importantly, we reasoned that brain regions in which activity discriminates disadvantageous from advantageous alternatives in a manner predictive of adaptive performance must have access to the task reinforcement contingencies. This predicts the involvement of the same regions in outcome evaluation, in a manner sensitive to previously formed expectations. We propose that these regions subserve the integration of trial-by-trial experience into adaptive decision making.
Materials and Methods
Subjects
Nineteen healthy adult males participated in the experiment (mean ± SD age, 24.20 ± 3.93 years; range, 18.21–31.28 years). Given previous evidence for significant sex differences in performance and brain activation during the IGT (Bolla et al., 2004; Tranel et al., 2005), only males were included in the study to increase homogeneity of the results. All participants were right handed, as assessed using the Edinburgh Handedness Inventory (Oldfield, 1971) (mean ± SD laterality quotient, 94.62 ± 10.85). Participant intelligence quotient (IQ) was estimated with the Wechsler Abbreviated Scale of Intelligence (WASI, Harcourt Assessment): mean ± SD IQ, 114 ± 11. Exclusion criteria were psychiatric or neurological disorders, learning disability, current or past drug abuse, head injury, and psychotropic medication.
Participants scored below clinical thresholds on the 28-item General Health Questionnaire (GHQ-28) (Goldberg and Hillier, 1979), an indicator of general quality of life, used to identify the presence of psychiatric symptoms (Goldberg et al., 1997) (group mean ± SD for total score, 2.16 ± 2.52). Participants were also administered the Barratt Impulsivity Scale (BIS-11), a multifactorial self-report measure of trait impulsivity in three domains: attentional (mean ± SD, 15.63 ± 2.65), motor (mean ± SD, 21.11 ± 4.23), and nonplanning (mean ± SD, 24.79 ± 5.55) (Patton et al., 1995); group mean ± SD for total score was 61.53 ± 10.13. The nonplanning subscale of the instrument (BIS-11NP) was used to test for associations between self-reported cognitive trait impulsivity and performance, as well as brain activation, using Pearson's correlation analyses. Our hypothesis concentrated on the nonplanning subscale because it is the nonplanning dimension of the more general construct of “impulsiveness” that is of relevance in the IGT (Bechara and Van Der Linden, 2005). For example, Malloy-Diniz and colleagues have demonstrated a correlation between BIS-11NP scores and (impaired) IGT performance in patients with attention deficit hyperactivity disorder (ADHD) (Malloy-Diniz et al., 2007), in contrast with the motor and attentional subscales, which did not show an association with task performance and are considered distinct dimensions of impulsiveness (Patton et al., 1995; Moeller et al., 2001).
All participants gave written informed consent and received £30 compensation for their participation. The study was approved by the local research ethics committee.
Gambling task
Participants were acclimatized to the scanner environment in a “mock” scanner, in which they practiced the task they were going to perform in an environment similar to the scanner facility. The practice session consisted of 12 trials that presented equal payoffs across all decks, and this difference between the training and experimental sessions was made explicit to participants.
The experimental gambling task used in this study is a computerized variant of the IGT (Bechara et al., 1994) and an additional modification of a recent fMRI adaptation of the task, described in detail previously (Lawrence et al., 2009). Briefly, participants were presented with four decks of cards (labeled A, B, C, and D) on a computer screen and were asked to select any one of the decks by pressing with their right hand one of four buttons, arranged horizontally on an MR-compatible button box to correspond with the four decks. They were administered 80 presentations of the decks, with the instruction to win as much money as possible and lose as little as possible. Participants were unaware of how many trials they would perform or how long the testing session would last. Decks A and B gave relatively large gains (£190, £200, or £210) but even larger losses (£240, £250, or £260), whereas decks C and D gave small gains (£90, £100, or £110) but even smaller losses (£40, £50, or £60). There was a 50% probability of winning or losing on each deck. Consequently, decks A and B were disadvantageous (also referred to as “risky”) because they led to overall loss, whereas decks C and D were advantageous (also referred to as “safe”) because they led to overall gain at the end of the task. Performance on the task is summarized by the subject's net preference score, i.e., the number of cards picked from the advantageous decks (C + D) minus the number of cards picked from the disadvantageous decks (A + B), with a high positive net score denoting the development of a preference for the advantageous relative to the disadvantageous decks. Net score was calculated for each of four blocks of 20 trials. The effect of block on net score was tested using within-subjects repeated-measures ANOVA.
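The net preference score described above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the function names and the use of `None` to mark an omitted trial are assumptions made for the example.

```python
# Minimal sketch of the net score: (C + D) picks minus (A + B) picks.
# Function names and the None-for-omitted-trial convention are hypothetical.

ADVANTAGEOUS = {"C", "D"}
DISADVANTAGEOUS = {"A", "B"}


def net_score(choices):
    """Number of advantageous picks minus number of disadvantageous picks."""
    good = sum(1 for c in choices if c in ADVANTAGEOUS)
    bad = sum(1 for c in choices if c in DISADVANTAGEOUS)
    return good - bad


def block_net_scores(choices, block_size=20):
    """Net score for each consecutive block of `block_size` trials."""
    return [
        net_score(choices[i:i + block_size])
        for i in range(0, len(choices), block_size)
    ]


# Illustrative (made-up) 80-trial session; None marks an omitted response.
example_choices = (
    ["A"] * 10 + ["C"] * 10      # block 1: no preference yet
    + ["C"] * 20                 # block 2: all advantageous
    + ["D"] * 15 + ["B"] * 5     # block 3: mostly advantageous
    + ["C"] * 19 + [None]        # block 4: one omitted trial
)
```

With 80 trials and a block size of 20, `block_net_scores` returns the four values that would enter the repeated-measures ANOVA on block.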
A critical difference of the current version compared with the fMRI adaptation by Lawrence et al. (2009) is the temporal separation of the choice response from its outcome. This allowed us to hemodynamically decouple, and therefore examine separately, the moment of decision and the moment of outcome evaluation. Each trial is therefore divided into the following: (1) the choice phase, from the moment of presentation of the four decks until the execution of the choice [reaction time (RT) to button press]; (2) a 6 s delay between choosing a deck and being presented with the outcome (win or loss); and (3) the outcome evaluation phase, a 3 s presentation of the outcome on screen. After each choice, the four decks remained on the screen, and the deck chosen by the participant was superimposed with a wheel divided into 12 equal segments; every 0.5 s, each consecutive segment filled with color, counting down to outcome presentation. Trials lasted 15 s, ending with a blank screen after outcome presentation, a period that served as an implicit baseline in the fMRI analysis. The maximum time allowed for a response was programmed to 6 s. If a response was omitted, the trial was programmed to progress directly to the blank screen for 9 s (making up the total trial time of 15 s). Omitted trials were excluded from the analysis. Intertrial intervals (ITIs) varied to maintain a 15 s trial duration. The length of each ITI was determined by the response (i.e., the RT, which jittered trial events) as follows: RT + anticipation + feedback + ITI = trial length ⇒ RT + 6 s + 3 s + ITI = 15 s ⇒ ITI = 6 s − RT.
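The trial timing can be made concrete with a short sketch. The constants are taken from the description above; the function names and the convention that `None` denotes an omitted response are assumptions for illustration only.

```python
# Sketch of the trial timing: RT + 6 s anticipation + 3 s feedback + ITI = 15 s.
# Function names and the None-for-omission convention are hypothetical.

ANTICIPATION_S = 6.0   # delay between choice and outcome
FEEDBACK_S = 3.0       # outcome presentation
TRIAL_S = 15.0         # fixed total trial duration
MAX_RT_S = 6.0         # response deadline


def iti_seconds(rt):
    """Blank-screen duration that keeps the trial at 15 s."""
    if rt is None:
        # Omitted response: the 6 s deadline elapses, then 9 s of blank screen.
        return TRIAL_S - MAX_RT_S
    if not 0.0 <= rt <= MAX_RT_S:
        raise ValueError("RT must lie within the 6 s response window")
    return TRIAL_S - rt - ANTICIPATION_S - FEEDBACK_S


def outcome_onsets(rts):
    """Outcome onset (s from session start) per trial; None if omitted."""
    onsets = []
    for trial_index, rt in enumerate(rts):
        trial_start = trial_index * TRIAL_S
        if rt is None:
            onsets.append(None)
        else:
            onsets.append(trial_start + rt + ANTICIPATION_S)
    return onsets
```

Because every trial lasts exactly 15 s, trial onsets are fixed while choice and outcome events are jittered by the participant's own RT.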
These manipulations significantly lengthened the duration of trials and of the whole task session. We therefore used 80 trials instead of the usual 100 (Bechara et al., 1994; Lawrence et al., 2009), to limit the duration of the task to a tolerable 20 min.
After completion of the experimental session, participants were asked whether they had picked more cards from any particular deck(s) or whether they had avoided any particular deck(s) and why. Participants were considered aware of the general contingencies of the task if they could report that decks A and/or B would “mostly lose,” or “made you lose more money than you won,” and/or that decks C and/or D would “mostly win,” or “let you win more money than you lost.”
Functional magnetic resonance imaging
Acquisition
Gradient echo-planar magnetic resonance imaging data were acquired on a GE Signa 3 tesla system (General Electric) at the Centre for Neuroimaging Sciences, King's College London, using a semiautomated image quality-control procedure. A quadrature birdcage head coil was used for radio frequency transmission and reception. In each of 22 noncontiguous planes, 800 T2*-weighted MR images depicting blood oxygen level-dependent (BOLD) contrast covering the whole brain were acquired [echo time (TE), 30 ms; repetition time (TR), 1.5 s; flip angle, 60°; in-plane resolution, 3.75 mm; slice thickness, 5.0 mm; slice skip, 0.5 mm]. A whole-brain high-resolution structural scan (inversion recovery gradient echo planar imaging; TE, 40 ms; TR, 3 s; flip angle, 90°; 43 slices; slice thickness, 3.0 mm; slice skip, 0.3 mm) was also acquired on which to superimpose the activation maps.
Data analysis
The fMRI data were analyzed with the XBAM software developed at the Institute of Psychiatry (http://www.brainmap.co.uk). The software uses a nonparametric permutation-based strategy rather than normal theory-based inference to minimize assumptions, and uses median rather than mean-based statistics to control for outlier effects. Finally, its most commonly used test statistic (supplemental Methods, available at www.jneurosci.org as supplemental material) is computed by standardizing for individual differences in residual noise before embarking on second-level multi-subject testing using robust permutation-based methods. This allows a mixed-effects approach to analysis, an approach that has been recommended recently following a detailed analysis of the validity and impact of normal theory-based inference in fMRI in a large number of subjects (Thirion et al., 2007). Individual- and group-level analyses are described in detail in supplemental Methods A and B, respectively (available at www.jneurosci.org as supplemental material).
Briefly, the fMRI data were realigned to minimize motion-related artifacts and smoothed using a Gaussian filter (full-width at half-maximum, 8.82 mm) (Bullmore et al., 1999). Time-series analysis of individual subject activation was performed with a wavelet-based resampling method described previously (Bullmore et al., 2001): we first convolved each experimental condition (choice contrasts: disadvantageous/advantageous choices; outcome contrasts: wins/losses), with two Poisson model functions (delays of 4 and 8 s). Using rigid body and affine transformation, the individual maps were registered into Talairach standard space. Group brain activation maps were then produced for each experimental condition, and hypothesis testing was performed by a cluster-level analysis method, shown to give excellent clusterwise type I error control (Bullmore et al., 2001). Time-series permutation was used to compute the distribution of the statistic of interest under the null hypothesis. The voxel-level threshold was first set to 0.05 to give maximum sensitivity and to avoid type II errors. Next, a cluster-mass threshold was computed from the distribution of cluster masses in the wavelet-permuted data, such that the final expected number of type I error clusters under the null hypothesis was less than one per whole brain. Group maps of the main contrasts for the choice and outcome phases are presented in supplemental Figure 1 (available at www.jneurosci.org as supplemental material).
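As a rough illustration of the modeling step, the sketch below builds two Poisson-shaped response functions with delays of 4 and 8 s, sampled at the 1.5 s TR, and convolves an event train with each. The continuous Poisson form, the kernel length, the normalization, and all function names are assumptions for the example; the actual XBAM implementation is described in the supplemental Methods and may differ.

```python
import math

# Assumed sketch: Poisson-shaped kernels (delays 4 s and 8 s) sampled at the TR.
# Kernel length, normalization, and names are illustrative choices only.

TR_S = 1.5  # repetition time reported for the acquisition


def poisson_kernel(delay_s, n_samples=16, tr_s=TR_S):
    """Poisson-shaped response sampled every tr_s seconds, scaled to sum to 1."""
    values = []
    for i in range(n_samples):
        t = i * tr_s
        # delay^t * exp(-delay) / Gamma(t + 1), evaluated in log space
        log_value = t * math.log(delay_s) - delay_s - math.lgamma(t + 1.0)
        values.append(math.exp(log_value))
    total = sum(values)
    return [v / total for v in values]


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i, s in enumerate(signal):
        if s == 0:
            continue
        for j, k in enumerate(kernel):
            if i + j < len(out):
                out[i + j] += s * k
    return out


def model_regressors(signal):
    """One regressor per assumed delay (4 s and 8 s)."""
    return [convolve(signal, poisson_kernel(d)) for d in (4.0, 8.0)]
```

Fitting a weighted sum of the two regressors lets the modeled response peak anywhere between the early and late kernels, which is the usual motivation for using two delays rather than one.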
Whole-brain correlation of brain activation with performance.
Performance deviations in the IGT are often interpreted as evidence for ventral prefrontal, or, more recently, amygdala dysfunction, to correspond with data from lesion studies (Dunn et al., 2006). Given the complexity of the task and the scarcity of neurophysiological evidence, this may, in many cases, represent an oversimplified extrapolation of associated pathology. Importantly, performance of pathological groups is compared with that of groups of “normal” healthy participants, with little consideration for individual differences in task engagement, strategy, risk aversion, or reward sensitivity. These are some of the factors that may influence an individual's performance on the task (Suzuki et al., 2003; Dunn et al., 2006), and there is marked variability in performance among both healthy and pathological populations, with subgroups of healthy adult participants failing to exhibit a preference for the advantageous options (Bechara and Damasio, 2002; Dunn et al., 2006; Cella et al., 2007).
Importantly, comparing advantageous and disadvantageous trials in the IGT does not allow the examination of neural substrates that are modulated throughout the testing session as participants sample the task contingencies that are likely to drive individual performance. Assessing the relationship of individual performance with neural activation is therefore likely to be more informative. It is characteristic that such analyses, compared with simple contrasts, have implicated wider networks of areas in neuroimaging studies with the task (Ernst et al., 2002; Bolla et al., 2003, 2005; Lawrence et al., 2009).
Therefore, in this study, we aimed to identify brain activation that was related to advantageous performance in all basic contrasts (advantageous/disadvantageous choices; win/loss outcomes). Whole-brain regression analysis using cluster-level permutation statistics was performed, identifying brain regions in which the magnitude of brain activation during the different contrasts correlated with performance (i.e., net score). Fewer than one false-positive cluster was expected at thresholds of p < 0.05 for voxel and p < 0.01 for cluster comparisons. Additional details of the analysis are provided in supplemental Methods C (available at www.jneurosci.org as supplemental material).
Modulation of brain activation by expected value of decision.
To characterize brain regions in which activation during outcome evaluation was driven by previous experience, we constructed the individual performance history of each subject by calculating the expected value (EV) of each decision based on the subject's previous experience with that particular deck. EV was calculated, according to probability theory (Glimcher, 2004; Glimcher and Rustichini, 2004), for each trial (t) where, for example, deck D was chosen, as the sum of the observed probabilities of winning ($p_{\mathrm{win}}$) and losing ($p_{\mathrm{loss}}$) on previous choices of the same deck (D), each weighted by the average value ($\mathrm{Mean}_{\mathrm{win}}$ or $\mathrm{Mean}_{\mathrm{loss}}$) of the corresponding outcomes from that deck: $EV_t = (p_{\mathrm{win},D} \times \mathrm{Mean}_{\mathrm{win},D}) + (p_{\mathrm{loss},D} \times \mathrm{Mean}_{\mathrm{loss},D})$.
Given the values of each deck [averaging −£50 in the disadvantageous decks and +£50 in the advantageous decks (Lawrence et al., 2009)] and the 0.5 probability of a win or loss for each choice, after sufficient sampling, EV tends to −25 for either of the disadvantageous decks (A or B) and +25 for either of the advantageous decks (C or D). With increased sampling then, the EV of disadvantageous decisions will be negative, and the EV of advantageous decisions will be positive; indeed there was a highly significant correlation between net score and the ratio of chosen decks with a positive EV (r = 0.99; p < 0.001). Despite this observation, it is important to note that this formulation is not a predictive model of decision-making behavior but simply defines the “context” of outcome presentation, in terms of the expected value of the response that caused it; it was used to inform the fMRI analysis to identify brain areas in which activation during outcome presentation was sensitive to the expectancy of associated decisions.
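The history-based EV can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code: losses are stored as negative amounts, a deck with no prior outcomes is assigned an EV of 0, and the function name is hypothetical.

```python
from collections import defaultdict

# Sketch of the history-based expected value (EV) of each decision.
# Assumptions: losses are negative amounts; a deck never sampled before has EV 0.


def expected_values(trials):
    """EV of each choice, from earlier outcomes on the same deck only.

    `trials` is a list of (deck, outcome) pairs in presentation order.
    """
    history = defaultdict(list)
    evs = []
    for deck, outcome in trials:
        past = history[deck]
        if not past:
            evs.append(0.0)  # assumed value before any experience with the deck
        else:
            wins = [x for x in past if x > 0]
            losses = [x for x in past if x < 0]
            p_win = len(wins) / len(past)
            p_loss = len(losses) / len(past)
            mean_win = sum(wins) / len(wins) if wins else 0.0
            mean_loss = sum(losses) / len(losses) if losses else 0.0
            evs.append(p_win * mean_win + p_loss * mean_loss)
        # The current outcome only informs later choices of this deck.
        history[deck].append(outcome)
    return evs


# Made-up sequence: one win and one loss on A, then on C, before a third pick.
example_trials = [
    ("A", 200), ("A", -250), ("A", 200),
    ("C", 100), ("C", -50), ("C", 100),
]
```

With signed outcomes the formula reduces algebraically to the running mean of all previous outcomes on the chosen deck, which is why the third pick from deck A carries an EV of −25 and the third pick from deck C an EV of +25 in this example.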
Trials carrying a positive EV are shorthanded as decisions made under “positive expectancy,” i.e., favorable outcomes were mostly expected, based on previous outcomes on the chosen deck. Similarly, trials carrying a negative EV are considered decisions made under “negative expectancy.” To dissociate brain areas in which activation was sensitive to positive and negative expectancy (i.e., to fluctuations in both the magnitude and valence of the outcome history of choices), we split EV into PosExp (by setting trials in which EV < 0 to 0) and NegExp (by setting trials in which EV > 0 to 0), interpolated the trial EV values across time points, and convolved each resulting function with the model of the hemodynamic response. Individual activation maps were recalculated by testing the goodness-of-fit of this convolution with the BOLD time series (see supplemental Methods A, Individual level analysis, available at www.jneurosci.org as supplemental material).
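The split into positive and negative expectancy amounts to zeroing one sign of the EV series. A minimal sketch follows, with hypothetical names; the interpolation across time points and the convolution with the hemodynamic model are not reproduced here.

```python
# Sketch of splitting per-trial EV into PosExp and NegExp modulators.
# Names are hypothetical; interpolation and HRF convolution are omitted.


def split_expectancy(evs):
    """Return (pos_exp, neg_exp): EV with negative or positive trials set to 0."""
    pos_exp = [ev if ev > 0 else 0.0 for ev in evs]
    neg_exp = [ev if ev < 0 else 0.0 for ev in evs]
    return pos_exp, neg_exp
```

The two series sum back to the original EV on every trial, so no information about magnitude or valence is lost by the split.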
Conjunction of brain activation correlated with adaptive performance and associated with both the choice and outcome phases of the task.
We aimed to characterize brain areas responsible for integrating expectancy-modulated outcome evaluation into a neural discrimination between advantageous and disadvantageous choices, in a manner predictive of adaptive performance. For this purpose, we performed a conjunction analysis to identify regions in which activation was correlated with net score, both (1) in the contrast of disadvantageous and advantageous decisions in the choice phase, and (2) in the EV-modulated contrast of wins and losses in the outcome phase. Notably, the assignment of a 0.5 probability of winning or losing regardless of the deck chosen ensures the independence of the contrasts during the choice and outcome phases.
Group maps of the whole-brain linear correlation of activation with net score were produced for each contrast of interest (choice contrast: advantageous vs disadvantageous; EV-modulated outcome contrast: win vs loss), as described above. These yielded the clusters in which the difference in activation (in either direction) during advantageous and disadvantageous choices was predictive of adaptive performance and the clusters in which the difference in EV-modulated activation (in either direction) during win or loss outcomes was predictive of adaptive performance.
We established the overlap in activation between these performance-correlated choice and EV-modulated outcome clusters, by deriving the voxelwise conjunction of the group maps produced by each whole-brain analysis of correlation with net score [(choice contrast correlated with net score) ∩ (EV-modulated outcome contrast correlated with net score)]. This operation yielded the areas in which the voxelwise correlation with net score was significant for both tests (p < 0.05), namely (1) disadvantageous versus advantageous choices, and (2) wins versus losses as modulated by EV (Nichols et al., 2005). Activation in these voxels, then, showed expectancy-modulated differential sensitivity to the valence of outcomes and mediated the discrimination of disadvantageous from advantageous choices, both in a pattern predictive of adaptive performance.
To test whether these overlap clusters were specific to EV sensitivity or whether they were responsive to outcome valence regardless of EV, we repeated the conjunction of the choice and outcome tests, this time using the unmodulated outcome contrast: [(choice contrast × net score) ∩ (outcome contrast × net score)]. Clusters that survived the first but not the second conjunction analysis were considered to be specifically EV-sensitive in their involvement in outcome evaluation.
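The logic of the two conjunctions can be illustrated with set operations on thresholded maps. This is a toy sketch: the maps are represented as dictionaries from voxel coordinates to p values, the names are hypothetical, and the voxelwise threshold stands in for the permutation-based inference actually used.

```python
# Toy sketch of the voxelwise conjunction and the EV-specificity check.
# Data structure and names are assumptions, not the XBAM implementation.


def significant_voxels(p_map, alpha=0.05):
    """Voxels whose correlation with net score passes the threshold."""
    return {voxel for voxel, p in p_map.items() if p < alpha}


def conjunction(p_map_a, p_map_b, alpha=0.05):
    """Voxels significant in both maps (intersection of the two tests)."""
    return significant_voxels(p_map_a, alpha) & significant_voxels(p_map_b, alpha)


def ev_specific(choice, outcome_ev, outcome_raw, alpha=0.05):
    """Voxels surviving the EV-modulated conjunction but not the unmodulated one."""
    return conjunction(choice, outcome_ev, alpha) - conjunction(choice, outcome_raw, alpha)


# Made-up p values for three voxels.
choice_map = {(1, 1, 1): 0.01, (2, 2, 2): 0.03, (3, 3, 3): 0.20}
outcome_ev_map = {(1, 1, 1): 0.02, (2, 2, 2): 0.04, (3, 3, 3): 0.01}
outcome_raw_map = {(1, 1, 1): 0.30, (2, 2, 2): 0.01, (3, 3, 3): 0.01}
```

In this example only voxel (1, 1, 1) would be labeled EV-sensitive: it passes both tests in the first conjunction but fails the unmodulated outcome test in the second.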
Results
Task performance
A robust spread of shifting toward advantageous options was observed across the whole group within the 80 trials used, comparable with the group levels reported in previous studies (Lawrence et al., 2009). As a group, participants shifted their choice preference toward the “advantageous” (C and D) decks over the testing session (Fig. 1) (repeated-measures ANOVA, effect of trial block on net score, F(3,54) = 3.49, p < 0.05); contrast analysis revealed that a linear function explained the change in net score over trial blocks (F(1,18) = 8.78, p < 0.01) (Furr and Rosenthal, 2003), in line with previous findings (Lawrence et al., 2009). Age and IQ did not affect performance (age correlation with net score, r = 0.08, p = 0.74; IQ correlation with net score, r = 0.38, p = 0.11). No significant correlations were observed between task performance and the total or subscale scores of the BIS-11.
Figure 1. Performance measure. Average net score for all participants across four consecutive blocks of 20 trials. Error bars represent SEM.
The rate and distribution of shifting to advantageous options was comparable with previous studies (Lawrence et al., 2009), taking into account the fact that here we used 80 trials instead of the usual 100 to calculate the net scores: total net scores ranged from −10 to 58 (mean ± SD, 22.32 ± 22.50). Sixteen of the subjects (84%) achieved a positive net score (range, 1–58; mean ± SD, 27.56 ± 20.48), and three subjects (16%) achieved a negative net score (range, −10 to −2; mean ± SD, −5.67 ± 4.04). Nine subjects (47%) could explicitly describe the contingencies of the available options (range of net scores, 26–58; mean ± SD, 43.33 ± 10.68), whereas 10 subjects (53%) could not describe the contingencies (range of net scores, −10 to 20; mean ± SD, 3.40 ± 8.50).
fMRI activation results
Choice phase
Making advantageous versus disadvantageous decisions.
Whole-brain correlation with net score of the disadvantageous > advantageous contrast revealed significant positive correlations in right vmPFC, dlPFC, premotor cortex, left parietal cortex, and right parahippocampal gyrus (Fig. 2, red palette) (supplemental Table 1A, available at www.jneurosci.org as supplemental material). The opposite contrast (advantageous > disadvantageous) revealed correlations with net score in a widespread network of areas, including dorsomedial PFC (dmPFC) and left vlPFC and dlPFC, ACC, posterior cingulate cortex (PCC), motor cortex, postcentral gyrus, bilateral temporal cortex, basal ganglia, thalamus, amygdala, hippocampus, and cerebellum (Fig. 2, blue palette) (supplemental Table 1B, available at www.jneurosci.org as supplemental material).
Figure 2. Choice phase: correlation of activation with adaptive performance. Whole-brain correlation of activation with net score for the disadvantageous > advantageous contrast (red–yellow palette) and for the advantageous > disadvantageous contrast (blue–green palette). Three-dimensional clusters of performance-correlated activation for the two contrasts are presented superimposed on horizontal slices, marked with the z-coordinate as distance in millimeters from the anterior–posterior commissure.
Outcome phase
Winning versus losing.
Whole-brain correlations with net score of the outcome contrast between wins and losses showed that adaptive performance was associated with increased activation during wins compared with losses in areas including left dlPFC, dmPFC, PCC, thalamus, and midbrain (Fig. 3A, red palette) (supplemental Table 2A, available at www.jneurosci.org as supplemental material). Increased activation during losses was associated with better performance in primarily right dlPFC, right parietal and left temporal cortex, parahippocampal gyrus, and precuneus (Fig. 3A, blue palette) (supplemental Table 2A, available at www.jneurosci.org as supplemental material).
Figure 3. Outcome phase: correlation of activation with adaptive performance. Whole-brain correlation of activation with net score of the contrasts win > loss (red) and loss > win (blue): A, across all trials; B, for areas sensitive to the level of EV of the trial, when EV was positive (positive expectancy); C, for areas sensitive to the EV of the trial when EV was negative (negative expectancy). Three-dimensional clusters of performance-correlated activation for the two contrasts are presented superimposed on horizontal slices, marked with the z-coordinate as distance in millimeters from the anterior–posterior commissure.
Expectancy-modulated evaluation of wins and losses.
We found that dlPFC, dmPFC, caudate, and occipitoparietal responses to wins were modulated by expectancy (Fig. 3B,C) (supplemental Table 2B,C, available at www.jneurosci.org as supplemental material). The network of areas that responded to losses compared with wins under positive expectancy was similar to the main (unmodulated) contrast but attenuated. In addition, under positive expectancy, there was no evidence of involvement in loss processing in ACC or dmPFC (Fig. 3B) (supplemental Table 2B, available at www.jneurosci.org as supplemental material). Conversely, areas responsive to losses compared with wins under negative expectancy included extensive right inferior prefrontal activation, extending into vlPFC and insular cortices, and activation in bilateral lateral PFC, ACC, dmPFC and pre-SMA, as well as ventral striatum and midbrain (Fig. 3C) (supplemental Table 2C, available at www.jneurosci.org as supplemental material).
Conjunction of performance-predictive activation during choice and outcome phases
We aimed to characterize the expected performance-predictive neural overlap between activation involved in outcome evaluation and activation that came to discriminate between disadvantageous and advantageous choices. Voxels in which the whole-brain correlation of activation with net score was significant for both the choice phase contrasts (advantageous vs disadvantageous) and the EV-modulated outcome contrasts (win vs loss) are listed in Table 1. Activation in right vmPFC and dlPFC responded more to disadvantageous compared with advantageous choices and to losses more than wins under negative expectancy. Activation in bilateral dorsomedial frontal cortex and right ACC responded more during advantageous compared with disadvantageous choices but were also preferentially sensitive to losses compared with wins under negative expectancy. All other areas, in predominantly left superior frontal and inferior parietal cortices, insula, caudate and thalamus, and right superior temporal gyrus, responded more during advantageous compared with disadvantageous choices and during wins compared with losses under positive expectancy.
Table 1. Conjunction of whole-brain correlations of choice and EV-modulated outcome contrasts with net score
Association of vmPFC activation with self-reported trait impulsivity
Based on associations of aberrant IGT performance and impulsive behavior (Cavedini et al., 2002; Toplak et al., 2005; Geurts et al., 2006; Malloy-Diniz et al., 2007; Verdejo-Garcia et al., 2007; van der Plas et al., 2008), not least in patients with vmPFC lesions (Bechara et al., 1994, 1998, 2000a), and previous fMRI demonstrations of vmPFC involvement in performance of the task (Fukui et al., 2005; Lawrence et al., 2009), we hypothesized an association between vmPFC activation and individual differences in levels of self-reported trait cognitive impulsivity and self-control, as measured with the nonplanning subscale of the BIS-11 (BIS-11NP). To test this hypothesis, we extracted the standardized BOLD response values [sum of squares (SSQ) ratios] of activation in the vmPFC region identified in the conjunction analysis (Table 1) from the group maps of the choice and outcome contrasts. Pearson's correlation analyses were then conducted between the SSQ values and BIS-11NP scores.
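The correlation test itself is a standard Pearson analysis and can be sketched without external libraries. The function names are hypothetical, and the vectors passed in would be the per-subject SSQ values and BIS-11NP scores, which are not reproduced here.

```python
import math

# Sketch of a Pearson correlation and its t statistic (n - 2 degrees of freedom).
# Function names are hypothetical; no study data are included.


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
```

For a sample of 19 subjects, a correlation of r = −0.47 corresponds to a t value of about −2.2 on 17 degrees of freedom, which can be compared against the Student t distribution to obtain the two-tailed p value.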
Differential activation during wins and losses in the vmPFC was significantly correlated with scores in the nonplanning subscale of the BIS-11 (Talairach coordinates: x, y, z = 21, 35, −7; r = −0.47, p = 0.044) (Fig. 4): low scores were associated with increased activation during losses compared with wins. Activation in this cluster during the choice phase was not correlated with BIS-11NP scores.
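The correlation analysis described above can be sketched as follows. The example assumes one extracted SSQ value and one BIS-11NP score per participant; the function names and the numeric values are hypothetical placeholders, not the study's data. The conversion from r to a t statistic uses the standard formula with n − 2 degrees of freedom.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("Need two samples of equal length (n >= 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("Correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def r_to_t(r: float, n: int) -> float:
    """t statistic for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical values for six participants (illustration only, not real data)
ssq_ratio = [0.30, -0.10, 0.20, -0.40, 0.05, -0.25]
bis11_np = [22.0, 27.0, 24.0, 30.0, 21.0, 26.0]

r = pearson_r(ssq_ratio, bis11_np)
t = r_to_t(r, len(ssq_ratio))
print(f"r = {r:.2f}, t({len(ssq_ratio) - 2}) = {t:.2f}")
```

The p-value would then be obtained from the t distribution with n − 2 degrees of freedom; that step is omitted here to keep the sketch free of external dependencies.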
Association of vmPFC activation during outcome presentation with self-reported trait impulsivity. Significant negative correlation (r = −0.5, p < 0.05) between statistical BOLD response (SSQ ratio) in vmPFC during outcome presentation and individual scores on the nonplanning subscale of the Barratt Impulsivity Scale (BIS-11NP). Higher BIS-11NP scores indicate increased trait impulsivity. Positive SSQs denote increased activation during wins compared with losses, and negative SSQs denote increased activation during losses compared with wins.
Discussion
We aimed to further elucidate the processes underlying adaptive decision making. Using an IGT variant temporally separating decision making and evaluative processes, we used a conjunction analysis to investigate brain areas that mediate preference shifts toward advantageous decisions by integrating outcome evaluation into the computation of choice utility.
We isolated areas that mediate adaptive performance by discriminating disadvantageous from advantageous options through modulating their responsiveness to the valence of outcomes according to the expected value of the decision that preceded them. vmPFC and dlPFC responded more during disadvantageous choices and during losses under negative expectancy, suggesting that they mediate shifting away from disadvantageous choices through their sensitivity to accumulating negative outcomes. Right vmPFC activation furthermore correlated negatively with cognitive impulsivity. dmPFC and ACC responded more during advantageous choices and were more sensitive to losses under negative expectancy, suggesting they potentiate adaptive decisions through processing negative outcomes. Last, superior frontal, temporal, and parietal cortices, insula, caudate, and thalamus responded more during advantageous choices and during wins under positive expectancy, suggesting that they potentiate adaptive decisions through confirming expected positive outcomes.
In right vmPFC and dlPFC, activation increases to disadvantageous alternatives were correlated with performance. In the same areas, the level of negative expectancy of the chosen deck drove increased responding to losses. This neural overlap represents a mechanism through which punishment information can flag potentially unfavorable actions, biasing future behavior away from them. vmPFC activation has been associated with adaptive IGT performance (Fukui et al., 2005; Lawrence et al., 2009), a finding supported by lesion studies (Bechara et al., 1994, 1999, 2000a). Here we further elucidate its involvement in expectancy-modulated processing of negative outcomes, which presumably feed back into the decision-making process. Similarly, dlPFC involvement in this process extends findings from PET (Ernst et al., 2002) and lesion (Bechara et al., 2000b; Manes et al., 2002) studies with the IGT, the cognitive complexity of which recruits the dlPFC in processes of working memory and temporal foresight (Bechara et al., 1998, 2000b; Rogers et al., 1999; Manes et al., 2002; Adinoff et al., 2003). We demonstrate the concerted involvement of these regions in the IGT, mediating affective and attentional shifts away from disadvantageous options, through expectancy-dependent outcome evaluation, necessary for adaptive performance (Dias et al., 1996; Fellows and Farah, 2003; Brand et al., 2006; Dunn et al., 2006; Fellows, 2007). Both the ventromedial and dorsolateral foci were right lateralized, a finding that, given that our participants were male, is in line with suggestions that males rely on right and females on left prefrontal engagement during IGT performance (Tranel et al., 2005).
Right vmPFC activation was specifically sensitive to losses under negative expectancy. Among accumulating demonstrations of differential processing of reinforcement valence (Liu et al., 2007; Wheeler and Fellows, 2008), specific evidence proposes a specialized role for the vmPFC, especially in the right hemisphere (Clark et al., 2003), in learning from punishment (Kim et al., 2006; Wheeler and Fellows, 2008), possibly supporting complex mental experiences such as regret and counterfactual processing (Camille et al., 2004), as well as “rational” decision making and resistance to “framing” effects (De Martino et al., 2006). Our results support this, adding weight to suggestions that wins and losses differentially affect learning in the IGT (Dunn et al., 2006; Fellows, 2007). We propose that vmPFC and dlPFC integrate increasingly expected negative outcomes into the discrimination of disadvantageous options. This formulation is compatible with models of complex decision making, suggesting that vmPFC mediates learning driven not by discrete reinforcer values but by the “state” reinforcers confer through the context and predictability of their presentation (Hampton et al., 2006).
Other areas of activation overlap between choices and outcomes showed associations between performance and increased activation during advantageous decisions, suggesting their involvement in potentiating actions congruent with long-term payoffs. These were primarily left lateralized, in the pre-SMA, insula, caudate, thalamus, and temporal and parietal cortex, and may strengthen the attentional and mnemonic mapping of long-term action outcomes, guiding adaptive future behavior. Characteristically, activation in most of these areas was sensitive to positive expectancy but also responded more to wins than losses regardless of expectancy, in line with the behavioral potentiation hypothesis. The role of these regions in temporal foresight is supported by evidence of superior frontal, insular, parietal, and caudate activation in related tasks, such as temporal discounting (McClure et al., 2004; Hariri et al., 2006; McClure et al., 2007; Wittmann et al., 2007; Rubia et al., 2009a).
Important exceptions to positive outcome sensitivity were in pre-SMA and ACC, in which activation was sensitive to negative expectancy, through either increased (confirmatory) responses to losses or attenuated responses to wins when disadvantageous outcomes were expected. This way, negative expectancy can drive behavioral preference for favorable options. This is in line with previously reported ACC involvement in error monitoring (Carter et al., 1999; Rubia et al., 2003; Walton et al., 2007) and ACC sensitivity to the previous probability that the ongoing action is incorrect (Hampton et al., 2006), as well as proposals of ACC and pre-SMA involvement in using decision outcomes and action values to modulate attention to available response options (Rushworth et al., 2004; Kennerley et al., 2006; Hayden et al., 2008) and in organizing voluntary action (Rushworth et al., 2004; Haggard, 2008).
Performance and activation correlations during the choice phase were also observed in PCC and precuneus. The PCC is reciprocally connected with the ACC (Kobayashi and Amaral, 2003), which monitors action outcomes to support learning of action values (Kennerley et al., 2006), and the parietal cortex, which directs visual attention (Kastner and Ungerleider, 2000). Neurons in the primate PCC respond to the relative value of outcomes depending on previous expectations, with activity persisting into future trials, predicting preference shifts in a gambling task (Hayden et al., 2008). The precuneus is reciprocally connected with the PCC, sending projections to ACC and striatum, among other areas, including the ventral PFC. It has been implicated in self-referential processing and directing attention during goal-directed movements, as well as in the absence of overt responding (Cavanna and Trimble, 2006). Interactions between these areas and frontostriatal circuits may thus underlie the integration of choice values with attentional modulation in the task.
Having separated the moment of decision from the presentation of outcomes, we also demonstrate that wins and losses are differentially processed in this task, in line with previous findings in other decision-making and reward paradigms (Elliott and Dolan, 1999; Frank et al., 2004; Liu et al., 2007). Specifically, adaptive performance was predicted by sensitivity to positive expectancy during outcome presentation in lateral PFC, ACC, dorsal striatum, and pre-SMA and by sensitivity to negative expectancy in vmPFC, insula, and ventral striatum. The demonstration of context-sensitive involvement of these regions in outcome evaluation in the IGT is in line with a long line of evidence from human and nonhuman studies (Kawagoe et al., 1998; Tremblay and Schultz, 1999; Elliott et al., 2000; Delgado et al., 2003, 2004, 2005; Matsumoto et al., 2003; Wallis and Miller, 2003; Rogers et al., 2004; Nieuwenhuis et al., 2005; Liu et al., 2007). Importantly, we show that, of the areas that adaptively evaluate outcomes, it is the vmPFC and the dlPFC, both sensitive to negative expectancy, that are also involved in the successful discrimination of risky from safe choices.
These are key structures in the influential “somatic marker” (SM) framework (Bechara and Damasio, 2005). We did not specifically aim to examine the development of somatic markers or their influence on choice behavior. Nevertheless, our findings are consistent with the neurophysiological substrates of the framework. The prominent influence of punishment on the areas we describe is consistent with the integration of affective (e.g., insula), working memory (e.g., dlPFC/caudate) and action control (e.g., ACC/caudate) aspects of decision making under ambiguity through vmPFC mediation. This formulation of our own data, however, despite echoing the SM proposal, does not presuppose or require the development of somatic markers and would only be “downstream” of their proposed mediation of affective associations in the task.
The vmPFC activation identified in our conjunction analysis was negatively associated with self-reported nonplanning trait impulsivity during responses to losses compared with wins. Therefore, individual differences in trait impulsivity are related to vmPFC reactivity to reinforcement, specifically during punishment. This elucidates associations between impulsivity, poor IGT performance, and vmPFC lesions (Bechara et al., 1994, 1998, 2000a) or pathologies (Blair, 2004; Boes et al., 2008; Huebner et al., 2008). Such pathologies include ADHD and antisocial behavior, characterized by impaired decision making attributable to insensitivity to the magnitude of punishments (Finger et al., 2008; Luman et al., 2008) and reduced activation in vmPFC during reward-related tasks (Rubia et al., 2009b).
We provide direct evidence for the nature of the long-standing vmPFC and dlPFC implication in IGT performance by showing their utilization of expected negative outcomes to drive performance shifts away from disadvantageous options. These results align neuroimaging data from this prolific neuropsychological tool with data from more experimentally principled decision-making paradigms. They also demonstrate, however, the involvement of additional brain areas, proposed to bind affective and attentional modulation to behavioral output, potentially overcoming individual tendencies in domains such as impulsivity and neural responsivity to reinforcement. Imbalances in the relative activity of these areas may lead, through multiple pathways, to aberrant decision making, decoupling adaptive outcome evaluation from temporal foresight.
Footnotes
- The study was supported by Medical Research Council Grant GO300155 (K.R.). We thank Dr. Natalia Lawrence for allowing us to modify her task version and for advice during the early stages of this project, Jeff Dalton for programming, and the staff at the Centre for Neuroimaging Sciences for their help and expert advice.
- Correspondence should be addressed to Dr Anastasia Christakou, Institute of Psychiatry, King's College London, Box P046, De Crespigny Park, London SE5 8AF, UK. anastasia.christakou{at}kcl.ac.uk