Abstract
Sleep deprivation (SD) has detrimental effects on cognition, but the affected psychological processes and underlying neural mechanisms remain largely unclear. Here we combined functional magnetic resonance imaging and computational modeling to examine how SD alters neural representation of specific choice variables (subjective value and decision conflict) during reward-related decision making. Twenty-two human subjects underwent two functional neuroimaging sessions in counterbalanced order, once during rested wakefulness and once after 24 h of SD. Behaviorally, SD attenuated conflict-dependent slowing of response times, which was reflected in an attenuated conflict-induced decrease in drift rates in the drift diffusion model. Furthermore, SD increased overall choice stochasticity during risky choice. Model-based functional neuroimaging revealed attenuated parametric subjective value signals in the midbrain, parietal cortex, and ventromedial prefrontal cortex after SD. Conflict-related midbrain signals showed a similar downregulation. Findings are discussed with respect to changes in dopaminergic signaling associated with the sleep-deprived state.
Introduction
Sleep deprivation (SD) has detrimental effects on a range of human cognitive functions, including working memory, learning, attention, and decision making (Durmer and Dinges, 2005; Ratcliff and Van Dongen, 2009; Whitney and Hinson, 2010). However, although sleep-loss induced impairments in cognition result in considerable societal costs, the mechanisms through which SD affects cognition remain poorly understood. For example, the overall findings regarding the effects of SD on decision making are mixed. Some studies report decreases in risk-taking behavior after SD (Killgore, 2007), whereas others report increases (Killgore et al., 2006; McKenna et al., 2007) or no behavioral effects (Acheson et al., 2007; Venkatraman et al., 2007). SD may also induce a positivity bias in valence judgments of pictures (Gujar et al., 2011) and increase gain maximization during decision making involving complex mixed gambles (Venkatraman et al., 2011).
One increasingly adopted approach to examine how alterations in neural processing after sleep loss lead to cognitive changes is the use of functional neuroimaging (Chee and Chuah, 2008), and a number of recent studies have applied this approach to reward processing and decision making (Venkatraman et al., 2007, 2011; Gujar et al., 2011; Libedinsky et al., 2011). Dopamine (DA), which plays a central role in these processes (Cools, 2008; Kable and Glimcher, 2009; Haber and Knutson, 2010), is upregulated after SD (Volkow et al., 2008, 2009). Also, wakefulness-promoting medications are thought to act partly through enhanced DA transmission (Boutrel and Koob, 2004), suggesting that DA release may function as an endogenous mechanism to maintain arousal during SD (Volkow et al., 2008) and might mediate associated antidepressant effects (Ebert et al., 1994; Gillin et al., 2001). SD might therefore affect decision making through a modulation of dopaminergic structures.
Indeed, the dopaminergic system, in particular the dopaminergic midbrain [encompassing the substantia nigra and ventral tegmental area (SN/VTA)], encodes crucial choice parameters (Kable and Glimcher, 2009), including the value (Schultz et al., 1997) and saliency (Bromberg-Martin et al., 2010) of decision options. However, because previous imaging studies on reward-based decision making after SD have not directly examined parametric value and control signals (Venkatraman et al., 2007, 2011; Libedinsky et al., 2011), the extent to which such neural representations are altered in the sleep-deprived state remains unclear.
The aims of the present study were therefore (1) to more comprehensively examine behavioral alterations in decision making after SD through computational modeling and analysis of reaction time (RT) distributions and (2) to examine neural representations of crucial decision variables (value and decision conflict) after SD using model-based functional magnetic resonance imaging (fMRI). Twenty-two healthy male volunteers were scanned in a within-subjects repeated-measures counterbalanced design, during rested wakefulness control (CON) and after 24 h of SD, while they performed a risky choice task [probability discounting (Green and Myerson, 2004)] that reliably activates dopaminergic structures, including the ventral striatum and SN/VTA (Peters and Büchel, 2009). Using model-based fMRI, we then examined how value and conflict representations are modulated after SD.
Materials and Methods
Participants.
Healthy male human volunteers (n = 22; mean ± SD age, 26.6 ± 4.22 years) participated in two fMRI scanning sessions on separate days. They had no history of neurological or psychiatric disorders, and the study procedure was approved by the local ethics committee.
Overall and individually, subjects showed normal daytime sleepiness as assessed using the Epworth Sleepiness Scale (Johns, 1991) (mean ± SD, 7.77 ± 2.83) and normal sleep quality as assessed using the Pittsburgh Sleep Quality Index (PSQI) (Buysse et al., 1989) (mean ± SD, 3.85 ± 1.35). The average bedtime in the 4 weeks preceding the experiment was 00:10 A.M. (SD, 2 h 10 min). In a medical screening questionnaire, all participants reported no previous or current disturbances of sleep, and nightshift workers were excluded from the study.
Procedure.
An overview of the procedure is shown in Figure 1a. In short, subjects performed a risky choice task (Fig. 1b) during fMRI, once during rested wakefulness (control condition) and once after 24 h of SD, with the order of conditions counterbalanced across subjects. One group of participants (group 1) was scanned first in rested wakefulness (in the evening) and then after a single night of SD (in the morning). The other group (group 2) was scanned first after a single night of SD (in the morning) and then after a single recovery night (scanning again in the morning).
Specifically, 2 or 3 d before the first fMRI scanning session, all participants were invited to a meeting at the department where they were informed about the procedures, filled in a medical questionnaire, and gave written informed consent to participate in the study. Additionally, they performed a short adaptive probability discounting task to estimate the discount rate of each subject (Peters and Büchel, 2009). Although all participants underwent a night of SD, they could not prepare for it because they did not know in advance whether they would spend the night asleep or awake.
The night of SD started between 8:00 P.M. and 9:00 P.M., in groups of two or three participants. Half of the subjects were randomly assigned to participate in two fMRI sessions in rested wakefulness that same evening (between 9:00 P.M. and 11:00 P.M.). The other half spent the time waiting. During the night hours of wakefulness, the participants were under permanent visual monitoring. They were allowed to spend their time pursuing all activities they would normally do during evening hours, excluding physical and intense mental activities, and they had to refrain from alcohol and caffeine. Between 8:00 A.M. and 10:00 A.M. the following morning, the participants performed the two sessions of fMRI scanning in the condition of SD. Those participants who had already performed the two sessions in rested wakefulness on the previous evening were rewarded with the monetary equivalent of one of their choices (one trial randomly picked from all four sessions). If on the selected trial subjects had chosen the reference reward of 20€, they received this amount. If they had chosen the risky reward, the gamble was performed, and subjects received either the larger reward or nothing.
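The payout rule described above can be illustrated with a short Python sketch. It is not the authors' procedure as implemented: the function name `payout`, the tuple layout, and the use of Python's `random` module are assumptions made for illustration.

```python
import random


def payout(trials, rng=random):
    """Pay out one randomly selected trial.

    trials: list of (chose_risky, amount, probability) tuples, a hypothetical
    layout pooling the trials of all sessions.
    """
    chose_risky, amount, probability = rng.choice(trials)
    if not chose_risky:
        return 20.0  # the certain reference reward in euros
    # The gamble is played out: win the larger amount or nothing.
    return amount if rng.random() < probability else 0.0
```

Because a single trial is drawn from all sessions, every choice has the same chance of being paid out.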
All other participants were re-invited the next morning between 8:00 A.M. and 10:00 A.M. after a full-sleep recovery night to be tested in the condition of rested wakefulness and to obtain their reward (again one trial randomly selected from all four sessions; see above). During the recovery night, subjects slept on average 10 h 48 min according to self-report, i.e., on average 2 h 40 min longer than their usual sleep duration according to the PSQI. Subjects thus compensated for previous sleep loss, indicating that at least a certain degree of recovery took place. Cognitive performance (Dinges, 1990) and cerebral blood flow (Balkin et al., 2002) typically normalize within 20 min after waking (sleep inertia). To account for this, all subjects were scanned at least 1 h after waking from the recovery night.
Task.
Based on the estimated discount rate obtained at the meeting before the scanning sessions, subject-specific trials were constructed for the fMRI experiment, a procedure that is feasible because probability discounting is a stable trait in individual subjects over periods of up to 3–4 months (Ohmura et al., 2006; Peters and Büchel, 2009). During fMRI, subjects performed a previously described probability discounting task (Peters and Büchel, 2009) that involved a series of choices between an immediate certain reward (20€ with 100% probability) and larger but risky amounts (Fig. 1b). On each scanning day, subjects performed two sessions of 48 trials each, yielding a total scan time of ∼20 min (96 trials in total) per day. Amounts for the risky reward ranged from 20.5 to 80€, which were combined with probabilities of 99, 96, 84, 54, 28, and 17%.
Amounts were calculated for each subject and for each probability using the following procedure. First, for each probability (see above), the euro amount subjectively equal to a sure-gain reward of 20€ was estimated using the probability discount rate from the pretest (the indifference amount). Per probability, eight trials were created with amounts spaced evenly around this indifference amount with a range of ±4€ (i.e., linearly spaced vectors of amounts, ranging from −4 to +4€ around the indifference amount). Eight additional trials were constructed in which the amounts ranged from 20.5 to [75, 76, 77, 78, 79, 80] € (uniformly distributed, with the particular upper limit randomized across probabilities), yielding 16 trials per probability in total. Thus, for each of the six probabilities, eight trials were created relatively close to the estimated indifference point (yielding trials with relatively high decision conflict). The eight additional trials spanning the entire range of amounts (20.5 to ∼80€), conversely, yielded trials ranging from very low to very high values (e.g., 76€ with 99% or 20.5€ with 17%). Combined, this ensured sufficient variance in both value and conflict for parametric fMRI analyses.
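The trial-construction procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: it assumes that the indifference amount follows from the hyperbolic discounting model (A = 20(1 + kϕ), with ϕ the odds against winning), that "uniformly distributed" means evenly spaced, and that the upper limit is passed in by the caller. It does not model how amounts were kept within the stated 20.5 to 80€ range. All function names and the value k = 3.0 are hypothetical.

```python
PROBABILITIES = [0.99, 0.96, 0.84, 0.54, 0.28, 0.17]  # from the task description
SURE_AMOUNT = 20.0  # certain reference reward in euros


def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop (inclusive)."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def indifference_amount(k, p, sure=SURE_AMOUNT):
    """Amount whose hyperbolically discounted value equals the sure reward.

    Assumes SV = A / (1 + k * odds_against), so A = sure * (1 + k * odds_against).
    """
    odds_against = (1.0 - p) / p
    return sure * (1.0 + k * odds_against)


def build_trials(k, p, upper_limit=80.0):
    """Return the 16 risky amounts for one probability."""
    center = indifference_amount(k, p)
    near = linspace(center - 4.0, center + 4.0, 8)  # high-conflict trials
    wide = linspace(20.5, upper_limit, 8)           # full-range trials
    return near + wide


# k = 3.0 is an illustrative discount rate, not a value taken from a subject.
trials = {p: build_trials(k=3.0, p=p) for p in PROBABILITIES}
```

With six probabilities and 16 amounts each, the sketch yields the 96 trials per scanning day mentioned in the task description.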
We note that this procedure of constructing trials based on performance in the pretest may affect the accuracy of model parameters estimated from the fMRI session, depending on the change in risk preferences. If preferences change dramatically relative to the pretest, model parameters may be less accurate because the new indifference points may fall outside the ±4€ range used for trial generation. However, in such a case, the direction of change would still be reliably detected, and the remaining trials span the entire range of amounts.
Computational modeling.
We analyzed single-subject choice data separately for the control and sleep-deprived sessions using maximum-likelihood parameter estimation (Lewandowsky and Farrell, 2011) with optimization procedures (“fminsearch”) implemented in Matlab (MathWorks). We applied the softmax choice rule to estimate the probability of choosing the actually chosen option on each trial, given the values of the available options (SVchosen and SVunchosen) calculated according to a particular model of risky choice:
$$P_{\mathrm{chosen}} = \frac{\exp(SV_{\mathrm{chosen}}/\mathit{temp})}{\exp(SV_{\mathrm{chosen}}/\mathit{temp}) + \exp(SV_{\mathrm{unchosen}}/\mathit{temp})} \tag{1}$$
Because the sure-gain reference reward was fixed at 20€, this value was modeled as a constant. In line with our previous study (Peters and Büchel, 2009), the value of the probabilistic option was initially modeled as a hyperbolic function of the odds against winning the gamble:
$$SV = \frac{A}{1 + k\phi} \tag{2}$$
Here SV is the subjective discounted value, A is objective reward amount, k is a subject-specific discount rate, and ϕ = (1 − P)/P is the odds against winning, where P is reward probability. This yields two free model parameters: the temperature parameter (Eq. 1, temp), corresponding to the stochasticity in the subjects' choices (i.e., the steepness of the sigmoid choice function, with larger values of temp yielding more stochastic choices), and the discount rate k (Eq. 2), where large values correspond to greater risk aversion and smaller values correspond to greater risk seeking. A probability discount rate of k = 1 entails risk-neutral behavior.
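The hyperbolic value function and the softmax choice rule described above can be sketched in Python. This is an illustrative sketch rather than the authors' Matlab implementation. It assumes that values are divided by temp in the softmax, which is consistent with temp being reported as a stochasticity parameter in the Results. The function names are hypothetical.

```python
import math


def hyperbolic_sv(amount, p, k):
    """Subjective value of a risky reward, discounted over odds against winning."""
    odds_against = (1.0 - p) / p
    return amount / (1.0 + k * odds_against)


def p_chosen(sv_chosen, sv_unchosen, temp):
    """Softmax probability of the chosen option.

    Assumes values are divided by temp, so larger temp means noisier choices.
    Written as a logistic function of the value difference, which is
    algebraically identical to the two-option softmax.
    """
    x = (sv_chosen - sv_unchosen) / temp
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so this cannot overflow
    return e / (1.0 + e)
```

For example, with k = 1 a 40€ reward at 50% has a subjective value of 20€, so `p_chosen(20.0, 20.0, temp)` is 0.5 for any temp, which corresponds to an indifference point.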
For comparison, we also fit our data with a different risky choice model, Prelec's probability weighting function (Prelec, 1998; Paulus and Frank, 2006; Takahashi et al., 2007, 2010; Hsu et al., 2009) in its dual-parameter implementation:
$$SV = A \cdot \exp\left(-\beta\,(-\ln P)^{\alpha}\right) \tag{3}$$
Here, A is again the reward amount, P is reward probability, and α is the degree of nonlinearity in the weighting function, with α = 1 implying linear weighting of probability (i.e., expected utility). β reflects the inflection point of the function, yielding three free parameters (α, β, and temp). Note that our task was not optimized to fully capture the inverse s-shape of this function (which reflects the overestimation of small and the underestimation of large probabilities), in particular because the lowest probability was 17% and an upper limit of 80€ on the reward magnitude was imposed, such that most subjects never selected the risky option in case of the 17% probability.
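The two-parameter Prelec weighting can be sketched as follows. The sketch assumes the standard form w(P) = exp(−β(−ln P)^α) and is not taken from the authors' code. The function names are hypothetical.

```python
import math


def prelec_weight(p, alpha, beta):
    """Two-parameter Prelec weighting: w(p) = exp(-beta * (-ln p) ** alpha).

    Intended for 0 < p < 1. With alpha = 1 and beta = 1 the weight equals p.
    """
    return math.exp(-beta * (-math.log(p)) ** alpha)


def prelec_sv(amount, p, alpha, beta):
    """Subjective value as amount times the weighted probability."""
    return amount * prelec_weight(p, alpha, beta)
```

With α < 1 (and β = 1), the function overweights small probabilities such as 17% and underweights large ones such as 99%, which is the inverse s-shape mentioned in the text.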
To obtain the best-fitting parameter estimates for each subject and model in each condition, we maximized the log-likelihood of the choice probabilities (using Matlab optimization functions) given a particular set of model parameters θ, summing across trials t:
$$LL(\theta) = \sum_{t} \ln P_{\mathrm{chosen},t}(\theta)$$
Goodness-of-fit was quantified using the Bayesian information criterion (BIC) (Schwarz, 1978):
$$\mathrm{BIC} = -2\,LL + n \ln(t)$$
where n is the number of free model parameters, and t is the number of trials included in the fitting procedure. Note that smaller BIC values indicate better fits.
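The fitting logic can be illustrated with a self-contained Python sketch for the hyperbolic model. It deliberately swaps the simplex search of Matlab's fminsearch for a coarse exhaustive grid search so that the example needs only the standard library; it is therefore not the authors' estimation procedure. It also assumes the softmax divides values by temp and uses the standard BIC formula. The trial layout and all names are hypothetical.

```python
import math


def p_choice(sv_chosen, sv_unchosen, temp):
    """Two-option softmax, written as a logistic of the value difference."""
    x = (sv_chosen - sv_unchosen) / temp
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def log_likelihood(trials, k, temp, sure=20.0):
    """Sum of log choice probabilities; trials are (amount, p, chose_risky)."""
    total = 0.0
    for amount, p, chose_risky in trials:
        sv_risky = amount / (1.0 + k * (1.0 - p) / p)
        if chose_risky:
            pc = p_choice(sv_risky, sure, temp)
        else:
            pc = p_choice(sure, sv_risky, temp)
        total += math.log(max(pc, 1e-300))  # guard against log(0)
    return total


def bic(ll, n_params, n_trials):
    """Bayesian information criterion; smaller values indicate better fits."""
    return -2.0 * ll + n_params * math.log(n_trials)


def fit_grid(trials, k_grid, temp_grid):
    """Exhaustive grid search (a stand-in for fminsearch).

    Returns (log-likelihood, k, temp) for the best grid point.
    """
    return max((log_likelihood(trials, k, t), k, t)
               for k in k_grid for t in temp_grid)
```

A grid search only approximates the optimum to the resolution of the grid. A simplex or gradient-based optimizer started from the best grid point would refine the estimate.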
Using the best-fitting parameter estimates from each individual subject's models, two measures were then calculated for model-based fMRI analysis. First, to examine neural responses correlated with subjective value, we calculated the subjective value of the risky option for each trial using Equations 2 or 3. Second, decision conflict was quantified by calculating the probability of choosing the actually selected option on each trial (i.e., Pchosen from Eq. 1). Conflict was then defined as Pchosen for Pchosen ≤ 0.5 and as 1 − Pchosen otherwise, i.e., conflict was maximal in trials on which both options were associated with the same choice probability. In contrast, when the value difference between options was relatively large, conflict was close to 0. Note that this regressor is negatively correlated with the difference in value between the chosen and unchosen options (mean ± SD correlation: rCON = −0.604 ± 0.161; rSD = −0.635 ± 0.156). However, crucial differences between the two regressors remain: the conflict regressor has greatest variance when the value difference between options is smaller (i.e., around the indifference points), whereas the chosen − unchosen regressor has greatest variance when the value difference is greater.
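The conflict definition can be written as a one-line function. The sketch below follows the definition in the text directly; only the function name is an assumption.

```python
def conflict(p_chosen):
    """Decision conflict: p_chosen if p_chosen <= 0.5, else 1 - p_chosen.

    Equivalent to min(p_chosen, 1 - p_chosen): maximal (0.5) at indifference,
    approaching 0 when one option is chosen almost deterministically.
    """
    return p_chosen if p_chosen <= 0.5 else 1.0 - p_chosen


# Illustrative choice probabilities, not data from the study.
example = [conflict(p) for p in (0.5, 0.75, 0.99)]
```

The measure is symmetric around 0.5, so a trial with Pchosen = 0.25 carries the same conflict as one with Pchosen = 0.75.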
Analysis of RTs.
We analyzed RTs in two ways: (1) by directly examining RT distributions and (2) by applying the Ratcliff drift diffusion model (RDM) (Ratcliff, 1978; Ratcliff and McKoon, 2008). For the first analysis, RTs for each subject were grouped into 10 bins based on the subjective discounted value of the risky reward in relation to the 20€/100% reference reward. We then calculated the median RT for each bin in each subject and averaged these across subjects. We also examined whether second-order polynomials provided a significantly better fit to the RT data than a linear model via variance ratio tests (F tests) and fit second-order polynomials and linear functions to the binned RT data of each subject using Matlab.
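For reference, the comparison of the nested linear and quadratic fits can be written as an extra-sum-of-squares F test. This formulation is an assumption about what the variance ratio test computes; the symbols RSS_lin, RSS_quad (residual sums of squares of the two fits) and N (number of value bins) are introduced here for illustration.

```latex
F = \frac{(\mathrm{RSS}_{\mathrm{lin}} - \mathrm{RSS}_{\mathrm{quad}}) / (3 - 2)}
         {\mathrm{RSS}_{\mathrm{quad}} / (N - 3)},
\qquad N = 10 \;\Rightarrow\; F \sim F(1, 7)
```

Under this assumption, the linear model has two parameters and the quadratic model three, so 10 bins give degrees of freedom of (1, 7), which matches the F(1,7) statistics reported for the binned RT analysis in the Results.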
In a second analysis, we applied the RDM (Ratcliff, 1978; Ratcliff and McKoon, 2008), which has previously been successfully used to examine cognitive effects of SD (Ratcliff and Van Dongen, 2009, 2011) and has recently been shown to accurately account for RT distributions in the context of value-based decision making (Milosavljevic et al., 2010). Using the RDM, RT distributions can be decomposed into underlying component processes (Ratcliff and McKoon, 2008). SD has recently been shown to affect multiple processes in a two-choice numerosity discrimination task (Ratcliff and Van Dongen, 2009), and, for comparison, we aimed to examine effects of SD on value-based choice using the RDM framework. Specifically, using the DMAT toolbox for Matlab (Vandekerckhove and Tuerlinckx, 2007, 2008), we fit a seven-parameter version of the RDM to the data, allowing all parameters to vary freely. For each subject, trials were grouped into hard trials (i.e., the 50% of trials in which the discounted value was closest to 20€) and easy trials (i.e., all other trials), separately for CON and SD. Responses consistent with participants' maximum-likelihood model parameters were then classified as correct and inconsistent choices as incorrect (Milosavljevic et al., 2010). This yielded four conditions (CON easy, CON hard, SD easy, SD hard) with, on average, fewer than 45 trials per subject and condition. We therefore fit the RDM to aggregate data from all subjects, a procedure that is legitimate if model fit at the single-subject level is not feasible (Lewandowsky and Farrell, 2011). Using nonparametric bootstrapping (n = 100 samples), we obtained SEs for the model parameter estimates.
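The grouping of trials for the RDM can be sketched in Python. This is an illustration of the classification rule described above, not the DMAT toolbox or the authors' code. The dictionary keys, the function name, and the handling of exact ties (a modeled value of exactly 20€ is treated as favoring the sure option) are assumptions.

```python
def classify_trials(trials, sure=20.0):
    """Label trials as hard/easy and correct/incorrect.

    trials: list of dicts with 'sv_risky' (model-based subjective value of the
    risky option) and 'chose_risky' (bool).
    Hard = the half of trials whose sv_risky is closest to the sure amount.
    Correct = the choice matches the option with the higher modeled value.
    """
    order = sorted(range(len(trials)),
                   key=lambda i: abs(trials[i]["sv_risky"] - sure))
    hard = set(order[: len(trials) // 2])
    labeled = []
    for i, trial in enumerate(trials):
        model_prefers_risky = trial["sv_risky"] > sure
        labeled.append({
            "difficulty": "hard" if i in hard else "easy",
            "correct": trial["chose_risky"] == model_prefers_risky,
        })
    return labeled
```

Applied separately to the CON and SD sessions, this produces the four conditions (CON easy, CON hard, SD easy, SD hard) that enter the diffusion model fit.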
fMRI data acquisition.
MRI data were acquired on a 3 T system (Siemens TIM-TRIO) using a 32-channel head coil. For each session, 240 volumes were acquired using tilted acquisition to optimize imaging of the orbitofrontal cortex (OFC). Each volume comprised 40 axial slices (2 mm thickness, 50% gap) acquired using a T2*-sensitive gradient echo-planar imaging sequence (2.46 s TR, 26 ms TE, 30° slice tilt, 90° flip angle, 2 × 2 mm in-plane resolution, 220 mm FOV). An additional high-resolution structural scan (MPRAGE, 1 × 1 × 1 mm voxel size, 240 slices) was obtained for anatomical overlay.
fMRI data analysis.
All preprocessing and statistical analyses were performed using SPM8 (Wellcome Department of Cognitive Neurology, University College London, London, UK), separately for the CON and SD sessions. Images were slice-time corrected to the onset of the middle slice, spatially realigned using a six-parameter affine transformation and unwarped to account for effects of subject movement. The T1 structural image was coregistered to the functional images and segmented according to gray matter, white matter, and CSF. Functional images were then spatially normalized using the normalization parameters obtained from the segmentation procedure and smoothed with a Gaussian kernel of 8 mm full-width at half-maximum. Data analysis was performed using the general linear model as implemented in SPM8-4010. The presentation of the decision option and the response prompt were modeled by convolving the event train of stimulus onsets with the canonical hemodynamic response function. We created different first-level models, separately for the CON and SD conditions. Model 1 included the subjective discounted value of the risky option as a parametric regressor. Model 2 included the subjective decision conflict as a parametric regressor. Model 3 included subjective value of the risky option, orthogonalized with respect to conflict, whereas in model 4, conflict was orthogonalized with respect to the subjective value of the risky option. Therefore, by examining the orthogonalized parametric regressors in the latter models (i.e., subjective value in model 3 and decision conflict in model 4), we could test for variance that is uniquely accounted for by the respective variable. Model 5 included the subjective value of the chosen option as a parametric regressor. These models were created once using the parametric regressors based on the hyperbolic model and once based on the Prelec model. 
Because the conflict regressor was negatively correlated with the difference in value between chosen and unchosen options (i.e., SVchosen − SVunchosen, see subsection computational modeling above), we created two additional first-level models. In model 6, conflict was orthogonalized with respect to chosen − unchosen value. In model 7, chosen − unchosen value was orthogonalized with respect to conflict.
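The logic of orthogonalizing one parametric regressor with respect to another can be sketched as a single Gram-Schmidt step. This is a generic illustration of the idea and not SPM8's internal routine. The function names are hypothetical.

```python
def mean_center(x):
    """Subtract the mean from every element."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def orthogonalize(target, reference):
    """Remove from `target` the variance it shares with `reference`.

    Both regressors are mean-centered first; the result is the residual of
    regressing the centered target on the centered reference.
    """
    t = mean_center(target)
    r = mean_center(reference)
    rr = sum(v * v for v in r)
    if rr == 0.0:
        return t  # constant reference: nothing to remove
    beta = sum(a * b for a, b in zip(t, r)) / rr
    return [a - beta * b for a, b in zip(t, r)]
```

Any effect that survives for the orthogonalized regressor reflects variance that the reference regressor cannot explain, which is the rationale for models 3, 4, 6, and 7.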
Single-subject contrast images were then taken to a second-level random-effects analysis using the full factorial design of SPM8. To address the possibility that the order of the control and SD conditions may have affected the results, we included task order (control/sleep-deprived) as a covariate of no interest in all second-level models. For all fMRI analyses, the statistical threshold was set to p < 0.05, corrected for multiple comparisons. For a priori regions of interest, correction was based on 12 mm spherical search volumes based on studies from other laboratories, i.e., independent data, indicated in the main text by psvc. Coordinates used were (±14, 8, −8) for ventral striatum (O'Doherty et al., 2004), (±9, −15, −15) for SN/VTA (Schott et al., 2006), and (±3, 42, −6) for ventromedial prefrontal cortex (vmPFC) (Chib et al., 2009). Otherwise, correction was based on the entire brain. Unless stated otherwise, statistical maps are thresholded at p < 0.001, uncorrected, with at least 10 contiguously activated voxels, and we report comprehensive imaging results at this uncorrected threshold for completeness.
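As an illustration of the search volumes, the following Python sketch checks whether a peak coordinate falls inside a sphere around one of the a priori coordinates. It assumes that 12 mm denotes the sphere radius (the text does not say whether it is a radius or a diameter) and covers only the geometric membership test, not the statistical correction itself. The names are hypothetical.

```python
import math

# A priori coordinates from the text (x given as a positive value; both
# hemispheres are searched by mirroring x).
ROI_CENTERS = {
    "ventral striatum": (14, 8, -8),
    "SN/VTA": (9, -15, -15),
    "vmPFC": (3, 42, -6),
}


def in_search_volume(peak, center, radius_mm=12.0):
    """True if peak lies within radius_mm of center or its x-mirrored twin."""
    cx, cy, cz = center
    return any(math.dist(peak, (sign * cx, cy, cz)) <= radius_mm
               for sign in (1, -1))
```

For example, under the radius assumption the right SN/VTA peak at (4, −14, −12) reported in the Results lies about 5.9 mm from the a priori coordinate (9, −15, −15) and therefore falls inside the search volume.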
Results
Behavioral data
Subjects responded within the 2 s response window on close to 100% of all trials in both conditions (mean ± SD, CON = 99.8 ± 0.44%; SD = 98.1 ± 3.4%). The number of no-response trials was significantly greater in SD than CON (paired t test, t(21) = 2.53, p = 0.019) (Tucker et al., 2011).
Computational modeling
We initially modeled behavioral data (excluding no-response trials) using the hyperbolic model of probability discounting (Green and Myerson, 2004; Peters and Büchel, 2009). A probability discount rate of k = 1 entails risk-neutral behavior, whereas larger values reflect risk aversion and smaller values risk seeking. Most subjects were risk averse [median (interquartile range) of k values, pretest = 3.26 (2.63–4.46), CON = 3.03 (1.38–7.66), SD = 3.86 (1.65–9.91)]. After log transformation to account for their skewed distribution, k parameters were stable between CON and SD sessions (Fig. 2a, test–retest correlation of log(k), r = 0.88, p < 0.00001) and not significantly different between CON and SD (paired t test, t(21) = 1.53, p = 0.14). In contrast, the stochasticity parameter temp increased significantly after SD (mean ± SD, CON = 2.44 ± 1.08, SD = 3.36 ± 1.59; paired t test, t(21) = 2.21, p = 0.038, two tailed), and model fit was worse [BIC (Schwarz, 1978): BICCON = 980.85, BICSD = 1104.03].
For comparison, we also fit our data with the two-parameter formulation of Prelec's probability weighting function (Prelec, 1998; Takahashi et al., 2007). Model parameters were again stable between testing sessions (test–retest reliability α parameter, r = 0.70, p = 0.0006; β-parameter, r = 0.67, p = 0.0012), and α was significantly lower after SD (mean ± SD, CON = 0.73 ± 0.26, SD = 0.63 ± 0.26; paired t test, t(21) = 2.42, p = 0.024), whereas β showed no difference (mean ± SD, CON = 2.39 ± 1.84, SD = 2.20 ± 1.67; paired t test, p = 0.55). Testing for an increase in choice stochasticity after SD (as in the hyperbolic model) revealed a similar nonsignificant trend (paired t test, t(21) = 1.79, p = 0.086, two tailed). Again, the fit of the model was worse in SD (BICCON = 943.80, BICSD = 1039.35).
Thus, regardless of the particular model that was examined, the degree of choice stochasticity increased after SD (although this was only a nonsignificant trend for the Prelec model), and overall preferences showed test–retest stability. Furthermore, if anything, subjects were slightly more risk averse when sleep deprived. In the hyperbolic model, this was reflected in a nonsignificantly higher k parameter (i.e., steeper discounting over odds against), whereas in the Prelec model, this was reflected in a significantly smaller α parameter (i.e., an increase in the nonlinearity of the probability weighting function).
As in previous studies (Takahashi et al., 2007), the Prelec model fit the data slightly better than the hyperbolic model, as indicated by the lower aggregate BIC scores (see above). To facilitate comparison with our previous results (Peters and Büchel, 2009), we initially retained the values obtained from the hyperbolic model for the additional analyses of RTs and parametric fMRI responses. To rule out that the imaging findings were dependent on the particular model examined, we also analyzed the neuroimaging data using results from the Prelec model.
Analysis of RTs
Overall, RTs were slower in SD (mean ± SD in ms: CON = 654 ± 86, SD = 697 ± 94; t(21) = −3.43, p = 0.002). For a first analysis, each participant's trials were grouped into 10 bins based on the value difference between the risky and the certain option [i.e., bins ranged from value(risky) ≪ value(sure) to value(risky) ≫ value(sure) in 10 steps], and RT distributions were examined. Average RTs for each value bin are plotted in Figure 2, b and c, along with the best linear and quadratic fits. RTs in the control condition were significantly better explained by a quadratic model than a linear model (variance ratio test, F(1,7) = 14.09, p = 0.0012; adjusted R2 linear = 0.35, quadratic = 0.75). In contrast, in the sleep-deprived state, the quadratic function did not provide a significant improvement in fit over the linear model (variance ratio test, F(1,7) < 1, p > 0.4; adjusted R2 linear = 0.78, quadratic = 0.74), suggesting the presence of an inverse U-shaped RT distribution in the control condition that was attenuated in the sleep-deprived state.
To complement this analysis, we binned trials into two bins (easy trials and hard trials) and fit the RDM (Ratcliff, 1978; Ratcliff and McKoon, 2008) to the aggregate data of all subjects (for details, see Materials and Methods). In CON, drift rates (which reflect the rate of evidence accumulation) were greater in easy compared with hard trials, reflecting the behavioral sensitivity to conflict. In the sleep-deprived state, this effect was considerably less pronounced (Fig. 3a). Similarly, the boundary separation parameter (greater values reflect increased accuracy) showed a more pronounced effect of conflict in CON than SD (Fig. 3b). Response bias was essentially unaffected by SD (Fig. 3c). Together, these analyses of RTs suggest an attenuation of conflict-dependent slowing in the sleep-deprived state.
FMRI data
First, to confirm normal alertness after the recovery night, all participants performed a 7 min continuous performance task [a bimanual letter-based go/no-go task with 105 go trials (two targets), 45 no-go trials (six targets), and 50 distractor letters] before scanning, which was compared with a sample of fully rested control participants. Neither RTs (p = 0.727) nor the proportion of hits (p = 0.411), misses (p = 0.833), or false alarms (p = 0.826) suggested any differences in performance. Second, we compared physiological measures between the two functional imaging sessions. Neither breathing rates (p = 0.2341) nor heartbeats per minute (p = 0.4336) showed differences between the control and SD conditions.
Decision option onset
Neural responses related to the onset of the decision option were examined (i.e., without parametric modulation) to address the question of whether neural responses were generally attenuated in the SD condition. We performed a conjunction analysis (Nichols et al., 2005) testing for regions showing significant task-related activation in both conditions. The most strongly activated regions (Fig. 4) were the lateral occipital cortex [MNI coordinates (x, y, z) = (40, −78, −10); Z value = 7.71; pFWE < 0.001], anterior cingulate cortex (ACC) [(−4, 18, 46); Z value = 6.79; pFWE < 0.001], and inferior frontal gyrus [(−50, 14, 32); Z value = 5.92; pFWE < 0.001].
We next examined differences in the task-evoked responses between the two conditions. Activity in a more inferior region of the ACC [(−10, 16, 30); Z value = 4.13] and the right anterior insula [(30, 20, −2); Z value = 3.41; Fig. 4] was greater during CON than SD. In contrast, a cluster in the rostral subgenual ACC/medial OFC (mOFC) showed stronger responses in SD than CON [(−4, 14, −12); Z value = 3.44; Fig. 4]. Note that these findings directly replicate recent observations of enhanced mOFC and attenuated insula responses during risky decision making after SD (Venkatraman et al., 2007, 2011).
Parametric analyses
Additional analyses focused on the neural representation of key parameters in the choice process: subjective value of the risky option, subjective value of the chosen option, and decision conflict. Measures were calculated for each trial, based on each participant's maximum-likelihood model parameters and included as parametric regressors in the fMRI analysis.
Value of the risky option and chosen value
The main effect for the subjective value of the risky option (across conditions) is listed in Table 1. Activity was observed in an extensive frontoparietal system, including bilateral superior parietal cortices, posterior cingulate, lateral PFC, as well as ACC and medial prefrontal regions. To illustrate the reproducibility of valuation effects in risky choice, Figure 5 displays the overlap of the neural subjective value correlation for the risky option from the control condition of the present study with the same contrast from our previous study (Peters and Büchel, 2009). Note the overlap in ventral striatum, SN/VTA, and frontoparietal regions.
A range of regions also showed a significantly greater correlation with subjective value in CON than SD (Table 1), including regions of the superior parietal cortex/precuneus, bilateral insulae, and, importantly, a large cluster in the midbrain, including the SN/VTA. Most of these regions, including the SN/VTA (Fig. 6), also survived a more conservative conjunction analysis (Nichols et al., 2005) testing for regions showing a positive correlation with subjective value in CON that was significantly attenuated after SD [both contrasts thresholded at p < 0.001, uncorrected (Table 1); psvc = 0.004]. A qualitatively similar reduction of valuation responses was observed in the right ventral striatum but did not survive correction for multiple comparisons across our a priori region of interest [(8, 16, −6); Z value = 2.87; psvc = 0.137].
In an additional analysis, we examined parametric responses to the subjective value of the chosen option (i.e., based on a separate first-level analysis; see Materials and Methods). Here, vmPFC [(−2, 44, −16); Z value = 3.44; psvc = 0.033] showed a positive correlation with the value of the chosen option in CON that was again significantly attenuated in SD (Fig. 6). (Note that, for reasons of brevity, additional effects for the chosen value regressor are not included in the present report.)
Decision conflict
The main effect of decision conflict was associated with activity in bilateral superior parietal cortex, left middle frontal gyrus, and ACC (Table 1). The SN/VTA region in the midbrain (Table 1) showed a significantly attenuated correlation with decision conflict in SD compared with CON. This region also survived a more conservative conjunction analysis testing for a positive correlation with conflict in CON that was significantly attenuated after SD (psvc = 0.035; Fig. 6).
In light of the negative correlation of the conflict regressor with chosen − unchosen value (Materials and Methods), we examined an additional model in which conflict was orthogonalized with respect to chosen − unchosen value. The same ACC region (Table 1) was still associated with decision conflict even when removing variance associated with chosen − unchosen value [(−6, 28, 42); Z value = 3.84; Fig. 7a]. Along similar lines, the same midbrain region still showed a conflict correlation in CON that was attenuated after SD [(−10, −14, −18); Z value = 3.25; Fig. 7b]. In contrast, chosen − unchosen value (orthogonalized with respect to conflict) was correlated with activity in posterior cingulate cortex [(4, −26, 30); Z value = 4.92; Fig. 7c] and vmPFC [(6, 40, −2); Z value = 4.48; Fig. 7c], replicating previous findings (FitzGerald et al., 2009).
Additional fMRI models
Midbrain regions showed both attenuated valuation and attenuated conflict responses after SD (Fig. 6). Because the two parameters are primarily orthogonal, these effects are unlikely to be related. Nonetheless, we formally examined the independence of these effects. Right SN/VTA showed a significant correlation with subjective value in CON that was reduced in SD, even when the value regressor was orthogonalized with respect to conflict [(4, −14, −12); Z value = 3.57; psvc = 0.021]. Likewise, left SN/VTA still showed a significant correlation with conflict in CON that was reduced after SD even when the conflict regressor was orthogonalized with respect to value [(−10, −16, −12); Z value = 3.58; psvc = 0.018].
Discussion
We used fMRI to examine changes in reward-based decision making after 24 h of SD, emphasizing the neural representation of core parameters of the decision process (value and decision conflict). We observed increased stochasticity in participants' responses after SD for two different models of risky choice. The typically observed response deceleration during high-conflict choices was attenuated after SD. This effect was reflected in an attenuated conflict sensitivity of the drift rate parameter of the RDM in SD (Ratcliff and Van Dongen, 2009). Neurally, we observed significantly attenuated valuation signals in the parietal cortex, SN/VTA, and vmPFC, regardless of the particular behavioral model that was used. Conflict responses in the SN/VTA showed a similar downregulation.
Previous findings regarding the behavioral effects of SD on decision making have been mixed: increases in risk-taking behavior after SD have been reported for the Iowa Gambling Task (IGT) (Killgore et al., 2006, 2007). A recent neuroimaging study involving gains and losses also revealed increased risk taking (Venkatraman et al., 2011), whereas a study that only involved gains did not (Venkatraman et al., 2007), suggesting that risk taking may be differentially affected depending on whether gains or losses are at stake (McKenna et al., 2007). Delay discounting and, as in the present study, probability discounting were not affected by SD (Acheson et al., 2007). However, previous studies did not use a model-based approach. For example, in the IGT, changes in stochastic responding may be misinterpreted as increases in risk taking, because both may increase selections from the high-risk deck. We observed increases in the stochasticity parameter of the softmax choice function after SD. Our results thus confirm and extend previous findings of increased stochastic variability in SD (Chee et al., 2008; Chee and Tan, 2010) to the domain of value-based decision making. Although we cannot rule out the possibility that this effect is partly attributable to a reduced adequacy of the models in the sleep-deprived state or affected by occasional eye closure during option processing, previous findings (Chee et al., 2008; Chee and Tan, 2010) are more consistent with the idea that increased stochastic variability may underlie these effects.
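The point that increased stochasticity can mimic increased risk taking can be made concrete with a minimal sketch of a two-option softmax choice rule. The parametrization below (a temperature-like stochasticity parameter that divides the value difference) is an assumption made for illustration, as are the subjective values; it is not necessarily the exact form of the fitted models.

```python
import math

def p_choose_risky(sv_risky, sv_safe, stochasticity):
    """Softmax probability of choosing the risky option.

    `stochasticity` is a temperature-like parameter (assumed form):
    larger values push choice probabilities toward 0.5.
    """
    if stochasticity <= 0:
        raise ValueError("stochasticity must be positive")
    return 1.0 / (1.0 + math.exp(-(sv_risky - sv_safe) / stochasticity))

# Hypothetical subjective values: the risky option is valued less.
SV_RISKY, SV_SAFE = 8.0, 10.0

p_rested = p_choose_risky(SV_RISKY, SV_SAFE, stochasticity=1.0)
p_sleep_deprived = p_choose_risky(SV_RISKY, SV_SAFE, stochasticity=5.0)
print(p_rested, p_sleep_deprived)
```

With unchanged subjective values, the higher stochasticity parameter alone raises the proportion of risky choices, which a purely behavioral analysis could misread as a change in risk preference.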
SD typically leads to a general slowing of RTs (Durmer and Dinges, 2005). In addition to this quantitative effect, our analysis of RT distributions revealed a qualitative effect of SD: the typically observed inverse U-shaped distribution of RTs (reflecting slowest responding for situations of highest conflict) was significantly attenuated after SD, an effect that, in the context of the RDM, was linked to a reduced conflict sensitivity of the drift rate parameter. Similar to other cognitive domains (Ratcliff and Van Dongen, 2009), cognitive modeling can thus reveal specific SD-induced impairments in decision making that go beyond basic alterations in risk preferences or RT increases.
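How a reduced conflict sensitivity of the drift rate translates into attenuated conflict-dependent slowing can be illustrated with a minimal diffusion simulation. The linear dependence of drift on conflict, all parameter values, and the function names are assumptions chosen for illustration; the sketch is not the fitting procedure used for the behavioral data.

```python
import random

def simulate_decision_time(drift, rng, boundary=1.5, dt=0.005, noise_sd=1.0):
    """Simulate one diffusion trial and return the time until a bound is hit."""
    x = boundary / 2.0              # unbiased starting point
    t = 0.0
    step_sd = noise_sd * dt ** 0.5  # noise scaling for an Euler step
    while 0.0 < x < boundary:
        x += drift * dt + rng.gauss(0.0, step_sd)
        t += dt
    return t

def mean_decision_time(drift, n_trials, rng):
    total = sum(simulate_decision_time(drift, rng) for _ in range(n_trials))
    return total / n_trials

def drift_rate(conflict, base_drift, conflict_sensitivity):
    """Assumed linear form: higher conflict lowers the drift rate."""
    return base_drift - conflict_sensitivity * conflict

rng = random.Random(0)
BASE_DRIFT = 2.0                                         # hypothetical
SENSITIVITY = {"rested": 1.5, "sleep_deprived": 0.5}     # hypothetical
N_TRIALS = 1500

slowing = {}
for state, k in SENSITIVITY.items():
    low = mean_decision_time(drift_rate(0.0, BASE_DRIFT, k), N_TRIALS, rng)
    high = mean_decision_time(drift_rate(1.0, BASE_DRIFT, k), N_TRIALS, rng)
    slowing[state] = high - low  # conflict-dependent slowing

print(slowing)
```

Under these assumptions, the simulated difference in decision time between high-conflict and low-conflict trials is smaller when the conflict sensitivity is reduced, mirroring the flattening of the inverse U-shaped RT pattern described above.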
At the neural level, we observed attenuated representations of subjective value in the SN/VTA, bilateral insula, and parietal cortex regions. Chosen value signals in the vmPFC showed a similar attenuation. Reduced lateral and medial parietal cortex responses are frequently observed after SD (Chee and Choo, 2004; Choo et al., 2005; Chee et al., 2006; Lim et al., 2007; Tomasi et al., 2009), and our results extend these observations to parametric value representations. Overall task-related responses (i.e., without parametric modulation) recruited a similar neural system, including the ventral visual stream, lateral PFC, and ACC. Enhanced mOFC/vmPFC and attenuated insula activation were evident in the categorical analysis of the sleep-deprived state, displaying foci that are consistent with recent findings (Venkatraman et al., 2011). In contrast, parametric chosen value signals were downregulated in a more anterior subregion of the mOFC/vmPFC (Libedinsky et al., 2011), highlighting differences between parametric and categorical analyses and suggesting some potential heterogeneity between anterior and posterior OFC subregions.
The present data therefore complement previous studies on reward-related processing after SD (Venkatraman et al., 2007, 2011; Gujar et al., 2011; Libedinsky et al., 2011) by going beyond categorical analysis techniques. The power of model-based analyses is that neural representations of specific choice parameters can be examined (O'Doherty et al., 2007). Neural changes can thus be related to ongoing computational processes. In contrast, conventional fMRI analyses focus mainly on differences in the location of task-related responses (Venkatraman et al., 2007, 2011).
Using model-based fMRI, we observed significantly attenuated value signals in the SN/VTA, precuneus/posterior cingulate, insula, and vmPFC/mOFC (Peters and Büchel, 2010). Previous studies have observed value-related responses in the SN/VTA in humans (Knutson et al., 2005; O'Doherty et al., 2006; D'Ardenne et al., 2008; Peters and Büchel, 2009; Zaghloul et al., 2009), and an attenuated midbrain value signal is of particular interest in the context of SD. SN/VTA activity measured using fMRI predominantly reflects DA release (Schott et al., 2008; Düzel et al., 2009), and SD decreases [11C]raclopride binding in striatum and thalamus, which may reflect increases in DA levels and/or a downregulation of D2 receptors (Volkow et al., 2008, 2009). How may these observations fit together? SN/VTA neurons release DA in the ventral striatum through two distinct firing modes, tonic and phasic (Grace, 1991; Goto et al., 2007). Burst firing of SN/VTA neurons (e.g., during unexpected reward) leads to a transient (phasic) release of DA in the ventral striatum (Goto et al., 2007). Low-frequency (tonic) background firing of SN/VTA neurons, in contrast, controls the baseline DA level in the striatum and PFC (Goto et al., 2007). These firing modes interact, such that only tonically active cells are assumed to be able to enter burst firing mode (Grace, 1991). However, an excess of tonic DA may at the same time disrupt phasic DA responses (Bilder et al., 2004), which in turn may result in attenuated parametric (phasic) signals as observed in the present study. It should be kept in mind, however, that DA release was not directly measured, and therefore the direct link between alterations in DA availability and changes in midbrain responses remains to be established.
The observation of a robust attenuation of value responses for probabilistic monetary rewards in the SN/VTA may appear to disagree with two recent fMRI studies that suggested enhanced reactivity of brain reward networks after SD (Gujar et al., 2011; Venkatraman et al., 2011). In one study, subjects evaluated mixed gambles consisting of both gain and loss options. Here, SD was associated with elevated vmPFC and attenuated insula activity (Venkatraman et al., 2011), two effects that were also observed in the categorical comparison of SD and CON in the present study. Using model-based fMRI, we can extend this observation by showing that parametric value representations in a number of reward-related regions (including SN/VTA and vmPFC) are robustly downregulated in SD. Another study examined neural responses to pleasant and neutral images in a between-subjects design (Gujar et al., 2011). Participants in the SD group rated more pictures as positive, and this “positive rating bias” was accompanied by, among other regions, enhanced activity in the SN/VTA. However, a number of important differences in the experimental design (e.g., picture stimuli vs probabilistic monetary rewards, between vs within subjects design, and so forth) make a direct comparison difficult.
Interestingly, we also observed reduced parametric responses to decision conflict in the SN/VTA. This finding complements a number of recent observations of SN/VTA responses scaling with trial-by-trial changes in task load (Boehler et al., 2011a) and cognitive control demands (Boehler et al., 2011b). In conjunction with the effects of subjective value discussed above, our observations are therefore compatible with a dual role of SN/VTA responses, signaling both motivational value and motivational salience (e.g., whether a stimulus requires allocation of cognitive resources, as during high-conflict trials) (Bromberg-Martin et al., 2010).
It could be argued that changes in parametric responses to value and conflict may simply be attributable to an overall reduced behavioral and neural engagement of subjects when sleep deprived. However, behavioral data indicate that, although subjects were generally slower and missed slightly more trials during SD, they responded within the 2 s response window in >98% of all trials. Also, although participants' choices were more stochastic after SD, their choices were nonetheless highly consistent with their behavior in the control condition, and the analysis of nonparametric task-related responses argues against a general attenuation of neural responses after SD.
We acknowledge some caveats of our SD intervention. First, a potential confound for group 2 (order SD → CON) is that a single night may not have been sufficient for full recovery after SD (but see Rosa et al., 1983). However, recovery sleep was long (>10 h on average) and performance on the continuous performance task suggested normal alertness after the recovery night. Second, scanning did not always take place at the same time of day (group 1, CON evening, SD morning; group 2, CON + SD in the morning), and circadian effects were thus different in the two groups. A related point is that daytime activities between groups before the CON scan likely differed. We cannot rule out that the last two points may have affected behavioral and neural differences between CON and SD.
Together, SD increased choice stochasticity and attenuated conflict-dependent slowing of RTs (decreased conflict sensitivity of the drift rate in the RDM). At the neural level, we observed attenuated value and conflict signals in the SN/VTA (Bromberg-Martin et al., 2010) and vmPFC (Libedinsky et al., 2011). Although the involvement of DA was not directly examined, our findings would be compatible with the view that SD increases DA levels and/or downregulates D2 receptors (Volkow et al., 2008) to a degree that interferes with phasic responses (Bilder et al., 2004), thereby attenuating the neural coding of crucial choice parameters.
Footnotes
This work was supported by Deutsche Forschungsgemeinschaft Sonderforschungsbereich 654 Project A12 (M.M.M. and C.B.) and PE 1627/3-1 (J.P.). We thank Sebastian Gluth and Nico Bunzeck for helpful discussions.
Correspondence should be addressed to Dr. Jan Peters, NeuroimageNord, Department for Systems Neuroscience, Martinistraße 52, D-20246 Hamburg, Germany. j.peters{at}uke.uni-hamburg.de