Abstract
Disparities in socioeconomic status (SES) lead to unequal access to financial and social support. These disparities are believed to influence reward sensitivity, which in turn is hypothesized to shape how individuals respond to and pursue rewarding experiences. However, surprisingly little is known about how SES shapes reward sensitivity in adolescence. Here, we investigated how SES influenced adolescent responses to reward, both in behavior and the striatum—a brain region that is highly sensitive to reward. We examined responses to both immediate reward (tracked by phasic dopamine) and average reward rate fluctuations (tracked by tonic dopamine), as these distinct signals independently shape learning and motivation. Adolescents (n = 114; 12–14 years; 58 female) performed a gambling task during functional magnetic resonance imaging. We manipulated trial-by-trial reward and loss outcomes, leading to fluctuations between periods of reward scarcity and abundance. We found that a higher reward rate hastened behavioral responses and increased guess switching, consistent with the idea that reward abundance increases response vigor and exploration. Moreover, immediate reward reinforced previously rewarding decisions (win–stay, lose–switch) and slowed responses (postreward pausing), particularly when rewards were scarce. Notably, lower-SES adolescents slowed down less after rare rewards than higher-SES adolescents. In the brain, striatal activity covaried with the average reward rate across time and was greater during reward than loss blocks. However, these striatal effects were diminished in lower-SES adolescents. These findings show that the striatum tracks reward rate fluctuations, which shape decisions and motivation. Moreover, lower SES appears to attenuate reward-driven behavioral and brain responses.
Significance Statement
Lower socioeconomic status (SES) is associated with reduced access to resources and opportunities. Such disparities may shape reward sensitivity, which in turn could influence how individuals respond to and pursue rewarding experiences. Here, we show that lower-SES adolescents display reduced reward sensitivity in the brain and behavior. The striatum—a brain region that is highly sensitive to reward—showed greater activations during periods of high reward and tracked fluctuations between reward-rich and reward-scarce task phases. However, lower SES correlated with smaller reward-driven striatal responses and reduced response slowing after rare rewards. These findings link lower SES to reduced reward responses, which could trigger a cycle of reduced reward pursuit, leading to fewer positive experiences, which could further diminish reward sensitivity.
Introduction
Adolescents from lower-SES backgrounds have less access to enriching opportunities and resources than their higher-SES peers (Farah, 2017). These disparities may influence reward sensitivity, which, in turn, could shape how adolescents respond to or pursue rewarding experiences (Amir et al., 2018). Such a cycle could explain how SES—by modulating reward responses and related processes—is associated with many consequential developmental outcomes (Farah, 2017). Here, we examined how SES relates to reward-driven responses in behavior and the brain in adolescents, focusing on the striatum because of its high sensitivity to reward (Schultz, 1993).
Rewards powerfully influence motivation, learning, and decision-making. Immediately rewarding outcomes, signaled by fast phasic striatal responses, are thought to serve as a learning signal to maximize rewards (Day et al., 2007). Rewarding outcomes strongly reinforce prior actions that led to rewards (Hamid et al., 2016) and induce “postreward pausing” in behavior (Schlinger et al., 2008). Individuals are also sensitive to the overall amount of reward available in their environment. The average environmental reward rate (tracked by tonic dopamine and estimated from past reward history) influences moment-to-moment shifts in response time and exploration (Niv et al., 2007; Hamid et al., 2016; Wang et al., 2021). A high environmental reward rate speeds responding, in theory, by increasing the cost of time (slower responses forfeit more rewards; Niv et al., 2006, 2007; Beierholm et al., 2013; Wang et al., 2013, 2021; Otto and Daw, 2019) and increases exploration, in theory, due to the high likelihood of attaining rewards in the environment (Niv et al., 2007; Constantino and Daw, 2015; Sukumar et al., 2023). Interestingly, these distinct reward signals also interact: reward scarcity heightens sensitivity to immediate reward, amplifying both phasic dopamine firing following rewards (Bayer and Glimcher, 2005; Hamid et al., 2016) and behavioral pausing after rewarding outcomes (Schlinger et al., 2008).
How SES influences responses to these distinct reward signals in adolescents in the brain and behavior remains unclear. Previous research suggests that lower SES may increase sensitivity to immediate reward, as lower-SES individuals tend to choose small immediate rewards over larger, delayed ones (Oshri et al., 2019). This is hypothesized to adaptively enable individuals to quickly seize scarce reward opportunities to meet basic needs (Frankenhuis et al., 2016; Pepper and Nettle, 2017; Frankenhuis and Nettle, 2020). Lower-SES environments can also be less predictable (Evans, 2004), meaning past reward history may poorly predict future outcomes (Ross and Hill, 2002; Behrens et al., 2007). Based on this research, lower-SES adolescents may be highly responsive to immediate reward but less responsive to past reward history, which could lead to contextually suboptimal behavior.
This hypothesis, however, contrasts with two studies that found that lower SES in adolescents correlated with reduced responses to rewarding cues in the parietal (White et al., 2022) and frontal (Palacios-Barrios et al., 2021) cortices. Notably, however, both studies linked lower SES to poorer behavioral learning of cue–reward associations (Palacios-Barrios et al., 2021; statistical trend, White et al., 2022), which may have altered expectations of reward when viewing reward-predicting cues. The present study therefore eliminated learning demands.
In the present study, we examined behavioral and striatal responses to reward and reward rate fluctuations in adolescents from diverse SES backgrounds. Adolescents performed a gambling task during functional magnetic resonance imaging in which they won or lost on each trial. Unbeknownst to participants, we manipulated trial outcomes, leading to alternating periods of reward scarcity and abundance. We examined how immediate reward and average reward rate fluctuations shaped vigor [response times (RTs)] and choices, and whether these influences differed by SES. We also examined SES-related differences in the influence of reward and average reward rate fluctuations on striatal responses. Our results support influential theories of decision-making that argue that the striatum tracks average reward rate fluctuations, as well as theories that suggest that lower SES reduces behavioral and striatal reward sensitivity.
Methods
Participants
We recruited 127 adolescents from diverse SES backgrounds as part of a larger project examining the relationship between SES, brain development, and cognition. Eligible participants were in the seventh or eighth grade, were proficient in English, had no MRI contraindications, were not diagnosed with autism or a neurological disability, and were not born prematurely (<34 weeks gestation). Thirteen children did not complete the MRI, resulting in a sample of 114 adolescents [age range = 12–14; mean (SD) = 13.46 (0.68); n = 56 female]. Five participants with excessive movement during scanning [average framewise displacement (FD) of more than 0.6 mm] were retained only for behavioral analyses, leaving 109 for the neuroimaging analysis (correlation between FD and SES among the included participants: β = 0.005, SE = 0.01, t(107) = 0.35, p = 0.730, r = 0.03). Of note, the findings remained unchanged with a more conservative movement limit (average FD of <0.3 mm). All children provided assent, and their legal guardians provided consent. The study was approved by the MIT Committee on the Use of Human Subjects. Participants received compensation for their time.
Before collecting data, we targeted a sample of at least 100 participants based on studies reporting medium-to-large effects (i.e., Cohen's d of 0.5–0.8) for the relationship between SES and cognitive performance (Noble et al., 2007; Finn et al., 2017; Leonard et al., 2019), brain structure (Romeo et al., 2018; A. L. Decker et al., 2020), and brain function (Finn et al., 2017). A sensitivity analysis revealed that our sample size provided 80% power to detect medium effects (d of 0.53 or Pearson's r of 0.25) in two-tailed between-subject analyses.
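For concreteness, a sensitivity analysis of this kind can be reproduced with the pwr package in R; the package choice and the split into two groups of roughly 57 are our assumptions for illustration, as the text does not specify the tool.

```r
# Minimal sketch of the sensitivity analysis; the pwr package and the
# 57-per-group split of the n = 114 sample are illustrative assumptions.
library(pwr)

# Smallest standardized mean difference detectable with 80% power in a
# two-tailed, two-sample comparison (~57 participants per group).
pwr.t.test(n = 57, sig.level = 0.05, power = 0.80,
           type = "two.sample", alternative = "two.sided")

# Smallest Pearson correlation detectable with 80% power given n = 114.
pwr.r.test(n = 114, sig.level = 0.05, power = 0.80,
           alternative = "two.sided")
```

Both calls solve for the unspecified effect size and return values close to those reported above (d ≈ 0.53; r ≈ 0.25–0.26).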
Measure of SES
Participants’ caregivers reported their annual household income (range = $2,000–$1.25 million) and the number of years of schooling they had completed (range = 7–20 years). Our primary measure of SES incorporated both of these variables. We averaged the z-scores of maternal education, paternal education, and log-transformed income (Fig. 1A depicts the SES distribution). The log transformation of income accounts for the greater impact that income gains have for lower-SES individuals. Two participants were missing one of the three measures, so their SES index was the average of the other two.
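As an illustration of the composite, the index can be computed as follows; the data frame and column names here are hypothetical.

```r
# Illustrative computation of the SES composite; `ses_df` and its column
# names are hypothetical.
z <- function(x) (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)

ses_df$ses_index <- rowMeans(
  cbind(z(ses_df$maternal_education),
        z(ses_df$paternal_education),
        z(log(ses_df$household_income))),
  na.rm = TRUE  # a participant missing one measure gets the mean of the other two
)
```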
Experimental design
Participants performed a variant of the Delgado et al. card-guessing task (Delgado et al., 2000; Hubbard et al., 2020a,b; Fig. 1B). On each trial, adolescents guessed if an upcoming number, with a possible value from 1 to 9, would be larger or smaller than 5. They then received immediate feedback based on the accuracy of their guess. Participants were told that accurate guesses would be financially rewarded as wins, inaccurate guesses would be financially punished as losses, and the sum of wins and losses would be calculated for an additional payment. Unbeknownst to participants, trial-by-trial gains and losses were predetermined and fixed across trials, with numbers generated to match the predetermined outcome for each trial. Outcomes were therefore unrelated to participant guesses, which equalized uncertainty across participants, and ensured everyone had the same experience of winning and losing.
Each trial began with a question mark, during which participants had 1.5 s to register a guess (smaller than 5 = index finger; larger than 5 = middle finger; Fig. 1B). A number was then displayed for 500 ms, followed by 500 ms of feedback. Feedback indicated whether participants had won or lost money or neither won nor lost money. Positive feedback, which followed correct guesses, consisted of a green arrow pointing up and the text “+$1”; negative feedback, which followed incorrect guesses, consisted of a red arrow pointing down and text displaying “−$0.5”; neutral feedback, which followed the number 5, consisted of a light green double-sided arrow. If participants did not register a guess, they received neutral feedback. This happened rarely (3.1 trials or 4.5% of trials on average per participant; the relationship between missed responses and SES: β = 0.27, SE = 0.29, t(114) = 0.95, p = 0.35). Participants viewed a fixation cross for 1 s before a new trial began.
Across both runs, the task consisted of eight blocks of eight trials each, with four blocks of mostly positive outcomes (“reward blocks”) and four blocks of mostly negative outcomes (“loss blocks”). Each of the two runs contained two reward and two loss blocks, and each block lasted approximately 28 s. This block design maximized the ability to detect striatal responses to reward, while also leading to alternating periods of monetary reward scarcity and abundance, allowing us to examine the influence of fluctuations in average reward rate across time (Fig. 1D). To keep participants unaware of the fixed outcomes, there was no delay between blocks, and blocks contained a few trials of the opposite type (Fig. 1C depicts the trial outcomes in a representative reward and loss block). Reward blocks included six reward trials interleaved with two loss or neutral trials. Loss blocks included six loss trials interleaved with two reward or neutral trials. All participants received $10 in bonus money after the task.
Image acquisition
Participants practiced the gambling task and completed a mock scanning session to acclimate to the MRI environment, which improves compliance (de Bie et al., 2010; Gao et al., 2023). They then completed two runs of the gambling task inside the scanner and watched a movie while we acquired a T1-weighted (T1w) anatomical scan. Images were acquired using a 3T Siemens Prisma Fit scanner with a 32-channel head coil. Whole-brain functional BOLD images were acquired using an EPI sequence (TR = 0.8 s, TE = 37 ms, flip angle = 52°, voxel size = 2 mm isotropic, multiband factor = 8). The two runs were acquired with reversed phase encoding to support distortion correction. High-resolution T1w images were acquired with an MP-RAGE sequence (TR = 2.4 s, TE = 2.18 ms, flip angle = 8°, voxel size = 0.8 mm isotropic).
Image preprocessing
Preprocessing of anatomical and functional data was performed using fMRIPrep version 22.1.1 (Esteban et al., 2019).
Anatomical preprocessing
The anatomical T1w image was corrected for intensity nonuniformity with N4BiasFieldCorrection (Tustison et al., 2010), distributed with ANTs 2.3.3 (Avants et al., 2008), and used as a T1w reference throughout the workflow. The T1w reference was then skull-stripped using an ANTs-based workflow with OASIS30ANTs as the target template. Brain tissue segmentation of gray matter, white matter, and cerebrospinal fluid was performed on the brain-extracted T1w using fast (FSL 6.0.5.1:57b01774; RRID: SCR_002823; Zhang et al., 2001). Brain surfaces were reconstructed using recon-all from FreeSurfer version 7.2.0 (Dale et al., 1999), and the brain mask estimated previously was refined with a custom variation of the method to reconcile ANTs-derived and FreeSurfer-derived segmentations of subcortical gray matter, including striatal subregions (Fischl et al., 2002). Volume-based spatial normalization to one standard space was performed through nonlinear registration, using brain-extracted versions of both the T1w reference and the T1w template. FSL's MNI ICBM 152 nonlinear 6th Generation Asymmetric Average Brain Stereotaxic Registration Model (Evans et al., 2012; RRID: SCR_002823; TemplateFlow ID: MNI152NLin6Asym) was selected for spatial normalization.
Functional preprocessing
A skull-stripped reference volume was generated using a custom methodology of fMRIPrep. Head motion parameters were estimated using mcflirt (FSL 6.0.5.1:57b01774; Jenkinson et al., 2002). The estimated fieldmap was aligned with rigid-body registration to the target EPI reference run, and field coefficients were mapped onto the reference EPI using the rigid-body transform. BOLD runs were slice-time corrected using 3dTshift from AFNI (Cox and Hyde, 1997; RRID: SCR_005927). The BOLD reference images were coregistered to the T1w reference using bbregister (FreeSurfer; Greve and Fischl, 2009) with six degrees of freedom. Noise regressors were estimated based on the preprocessed BOLD data. FD was computed using two formulations, following Power et al. (2014) and Jenkinson et al. (2002). Physiological regressors were extracted from eroded cerebrospinal fluid and white matter volumes for use in subsequent component-based noise correction (CompCor; Behzadi et al., 2007). The BOLD time series were resampled into standard space in a single interpolation step by composing all the pertinent transformations (i.e., head motion transform matrices, susceptibility distortion correction, and coregistrations to anatomical and output spaces). Volumetric resamplings were performed using ANTs, configured with Lanczos interpolation to minimize the smoothing effects of other kernels (Lanczos, 1964).
Statistical analyses
Statistical analyses were conducted in R (version 4.2.2). Raw data, code, extended analyses, and supplementary tables are available at the following link: https://osf.io/pqtby/. Unless stated otherwise, linear mixed-effects regressions or generalized linear mixed-effects regressions were employed for data that repeated within participants (e.g., single-trial RTs). Mixed-effects models included random intercepts for each participant and random slopes for fixed effects that repeated within participants. In the case of nonconverging models, we followed the recommendations by Brown (2021), iterating through the following until they converged: (1) using the “bobyqa” optimizer, (2) increasing the number of iterations, (3) forcing zero correlations among random effects, and (4) dropping random effects based on model comparison. RTs that fell more than three absolute deviations from an individual's median RT were excluded (n = 2 on average per participant). Measures were mean centered within or across participants or effect-coded prior to model fitting.
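To make this pipeline concrete, the sketch below illustrates the RT exclusion rule and the convergence fallbacks, assuming the lme4 package (the text names R and the fallback steps, not the package interface); the data frame and variable names are hypothetical.

```r
# Sketch of the trialwise RT exclusion and convergence fallbacks, assuming
# lme4 and dplyr; `trials`, `subj`, and `rt` are hypothetical names.
library(lme4)
library(dplyr)

# Exclude RTs more than three absolute deviations from each participant's
# median RT (here interpreted as the raw median absolute deviation).
trials_clean <- trials %>%
  group_by(subj) %>%
  filter(abs(rt - median(rt, na.rm = TRUE)) <=
           3 * mad(rt, constant = 1, na.rm = TRUE)) %>%
  ungroup()

# Convergence fallbacks, applied in order until a model converges:
ctrl <- lmerControl(optimizer = "bobyqa",          # (1) switch optimizer
                    optCtrl = list(maxfun = 2e5))  # (2) increase iterations
# (3) force zero correlations among random effects via the || syntax;
# (4) drop random effects guided by model comparison.
```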
Calculating trial-by-trial shifts in the moving average reward rate
We computed an exponentially weighted moving average (EWMA) of trial-by-trial gains and losses (+$1, −$0.5, or $0; see Fig. 1C,D for a schematic). Each trial was assigned a value based on the recent reward and loss history. High values indicated more gains than losses, whereas low values indicated more losses than gains. We used an exponentially weighted (rather than simple) moving average to emphasize recent time points, which have a larger impact on psychological state, while still incorporating data points from farther in the past (Awheda and Schwartz, 2016). We used the standard EWMA update rule, EWMA(t) = α · r(t) + (1 − α) · EWMA(t − 1), where r(t) is the outcome on trial t and α is the weighting parameter governing how strongly recent outcomes are emphasized.
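A minimal sketch of this computation in R follows; the default α value and the first-trial initialization are illustrative assumptions rather than the parameters used in the analyses.

```r
# Exponentially weighted moving average of trialwise outcomes (e.g., +1,
# -0.5, 0). The default alpha and the first-trial initialization are
# illustrative assumptions.
ewma <- function(outcomes, alpha = 0.3) {
  out <- numeric(length(outcomes))
  out[1] <- outcomes[1]
  for (t in seq_along(outcomes)[-1]) {
    out[t] <- alpha * outcomes[t] + (1 - alpha) * out[t - 1]
  }
  out
}

# Example: a run of losses followed by wins raises the average reward rate.
ewma(c(-0.5, -0.5, -0.5, 1, 1, 1, 1, -0.5))
```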
As an exploratory analysis, we also tested whether individual differences in optimal learning rates for the average reward rate variable differed by SES. To do so, we fit a model that estimated the learning rate as a free parameter for each participant using R's base optim function with the L-BFGS-B algorithm. The algorithm identified the learning rate that minimized the residual sum of squares (RSS) in a model predicting each participant's subsequent RTs from their EWMA of reward.
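A sketch of this per-participant fit, building on the ewma() helper sketched above; the starting value and bounds on α are illustrative assumptions, and `outcomes` and `rts` stand for one participant's trialwise data.

```r
# Per-participant fit of the EWMA learning rate with base R's optim
# (L-BFGS-B), minimizing the RSS of a model predicting each trial's RT
# from the EWMA on the preceding trial. Starting value and bounds are
# illustrative assumptions.
fit_alpha <- function(outcomes, rts) {
  rss <- function(alpha) {
    pred <- ewma(outcomes, alpha)
    fit  <- lm(rts[-1] ~ pred[-length(pred)])  # subsequent RT ~ prior EWMA
    sum(residuals(fit)^2)
  }
  optim(par = 0.3, fn = rss, method = "L-BFGS-B",
        lower = 0.001, upper = 1)$par
}
```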
Characterizing behavioral responses to rewards
We examined how immediate feedback (win vs loss outcomes) and fluctuations in the average reward rate shaped RTs and guesses. We first fit a model predicting RTs from the preceding trial's feedback (win, loss), the moving average of reward, and their interaction. We then refit this model after adding SES as a covariate and interaction term. We also examined the influence of immediate reward and average reward rate fluctuations on choices—specifically, how likely an individual was to repeat their prior guess or switch to a different guess. Here, the dependent variable was whether an individual switched from their prior guess (switched = 1; stayed = 0), and the independent variables were the preceding trial's feedback (win, loss), the moving average of reward, and their interaction. We refit this model after adding SES as a covariate and interaction term. All models included trial number as a covariate to control for general effects of time on task. Since there were only eight neutral trials per participant across the task, trials that followed neutral feedback were excluded from the analysis.
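In lme4 syntax, the two base models take roughly the following form; variable names are hypothetical, `ctrl` and `trials_clean` refer to the earlier sketch, and the SES terms are added in a second step as described above.

```r
# Sketch of the two behavioral models, assuming lme4; variable names are
# hypothetical and `ctrl` is the convergence control defined earlier.
m_rt <- lmer(
  rt ~ prev_feedback * reward_rate + trial +
    (prev_feedback * reward_rate | subj),
  data = trials_clean, control = ctrl
)

m_switch <- glmer(
  switched ~ prev_feedback * reward_rate + trial +
    (prev_feedback * reward_rate | subj),
  data = trials_clean, family = binomial,
  control = glmerControl(optimizer = "bobyqa")
)
```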
The relationship between SES and striatal volumes
Three linear mixed-effects models were fit to examine the association between SES and ROI volumes, separately for the caudate, putamen, and nucleus accumbens. Each model predicted volume from SES, hemisphere, and their interaction, to determine whether the influence of SES was stronger for one particular hemisphere. Age, sex, and intracranial volume were also included as covariates. ROIs with volumes that fell >3 absolute deviations from the sample median were excluded (all regions for one participant and the caudate and right nucleus accumbens for another).
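Each of these models corresponds roughly to the following specification, shown for a single ROI with hypothetical variable names.

```r
# Sketch of one ROI-volume model (e.g., the caudate), assuming lme4, with a
# random intercept per participant for the two hemispheres; variable names
# are hypothetical.
m_caudate <- lmer(
  volume ~ ses * hemisphere + age + sex + icv + (1 | subj),
  data = caudate_volumes
)
```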
Examining striatal responses to reward and to average reward rate fluctuations across time
To ascertain whether striatal activations differed between reward and loss blocks and to examine their covariance with average reward rate fluctuations, we conducted neuroimaging analyses with Nilearn. The scripts and data are publicly accessible (https://osf.io/pqtby/). The approach involved two separate general linear models (GLMs) applied to participant data within the MNI coordinate space. The first model had distinct regressors for reward and loss conditions. The second model incorporated a regressor for the EWMA of reward, resampled to the fMRI TR. Task regressors in both models were convolved with SPM's canonical hemodynamic response function, and both models controlled for head movement and noise (three translation and three rotation parameters, plus the top five aCompCor principal components, defined in a combined white matter and cerebrospinal fluid mask). This analysis yielded z-value effect size maps for each participant. The maps were entered into a group-level analysis to identify striatal voxels that were sensitive to the reward versus loss block contrast or to the average reward rate. Sensitivity was defined by voxel significance within the anatomical striatal mask from the Harvard–Oxford Atlas (FDR-corrected p < 0.05, minimum cluster size of 10 voxels). For each analysis, we calculated the mean z-value per participant across responsive voxels, separately for the caudate, putamen, and nucleus accumbens in each hemisphere. Participants therefore had six z-values (one per ROI) for each analysis. These values represented the average effect size for the difference in activations between reward and loss blocks and for the relationship with the average reward rate.
To assess the degree to which these effect sizes deviated from zero, we fit two intercept-only linear mixed-effects models predicting mean z-values per ROI, controlling for age and sex, with random intercepts per participant to account for repeated measures across hemispheres. We excluded outlier values that fell more than three absolute deviations from the sample's median (one value for the left putamen and one for the left caudate). Including outliers did not change the pattern of results.
Examining how reward-driven striatal responses differ by SES
Finally, we tested how SES related to activation level differences between reward and loss blocks, as well as the degree to which striatal activations covaried with fluctuations in the average reward rate. To this end, we fit two linear mixed-effects models. The dependent variables were z-values reflecting either activation level differences for reward and loss blocks or the covariance between striatal activations and average reward rate fluctuations. Both models included SES, hemisphere, and their interaction as independent variables and covariates for age and sex.
Results
We first describe how behavioral responses, specifically RTs and choices, are influenced by immediately rewarding outcomes and covary with fluctuations in the average reward rate across time. We then describe how these behavioral responses differ by SES. Turning to the neuroimaging data, we explore the association between SES and the volume of the putamen, caudate, and nucleus accumbens. We then examine differences in striatal activations during reward versus loss blocks and how these activations covary with temporal fluctuations in the average reward rate. Finally, we focus on disparities in striatal responses across SES.
Average reward rate fluctuations influence RTs and postreward pausing
Adolescents responded more slowly after winning than losing (i.e., postreward pausing: β = 0.02, SE = 0.005, t(262) = 4.68, p < 0.001; Fig. 2A). Furthermore, trial-by-trial RTs covaried with fluctuations in the average reward rate, such that a higher average reward rate led to faster RTs (β = −0.04, SE = 0.02, t(98) = −2.32, p = 0.022). Fluctuations in the average reward rate also interacted with immediate feedback to shape RTs: periods of reward scarcity amplified postreward pausing (reward rate × preceding feedback: β = −0.05, SE = 0.02, t(104) = −3.53, p < 0.001; Fig. 2B), indicating that responses to immediate reward were amplified by a history of low rewards. In fact, postreward pausing was observed when rewards were scarce but not when they were plentiful (effect of preceding feedback when the reward rate is centered at −1 SD below the mean: β = 0.04, SE = 0.007, t(100) = 6.23, p < 0.001; at +1 SD above the mean: β = 0.003, SE = 0.008, t(104) = 0.43, p = 0.665). These findings show that adolescents tracked fluctuations in the average reward rate, which shaped RTs across time and modulated sensitivity to immediate reward.
Average reward rate fluctuations influence guess switching
Immediate feedback reinforced decisions on subsequent trials: when adolescents won, they were more likely to repeat their prior guess on the subsequent trial than if they had lost (β = −0.31, SE = 0.04, z = −7.31, p < 0.001; Fig. 2C). A lower average reward rate also increased the likelihood of repeating a previously rewarded guess (i.e., increased win–stay, lose–switch effects; reward rate × preceding feedback: β = 0.57, SE = 0.12, z = 4.73, p < 0.001; Fig. 2D). Indeed, win–stay effects were most prominent when the average reward rate was low, indicating that a history of low rewards increased the tendency to stick with a rare rewarding option (main effect of immediate feedback on choices when the average reward rate is centered at −1 SD below the mean: β = −0.51, SE = 0.05, z = −9.64, p < 0.001; at +1 SD above the mean: β = −0.11, SE = 0.06, z = −1.75, p = 0.080). In general, a history of high rewards (a higher average reward rate) also increased the likelihood of switching guesses across trials (β = 0.68, SE = 0.15, z = 4.46, p < 0.001), suggesting a greater tendency to make alternative exploratory decisions when rewards were abundant. These findings suggest that a history of low reward increases the tendency to stick with a previously rewarding option and reduces the tendency to explore alternatives for reward.
Reward rate fluctuations influence postreward pausing more in higher-SES adolescents
Immediate reward and average reward rate fluctuations influenced choices similarly regardless of SES (SES × feedback: β = 0.09, SE = 0.05, z = 1.73, p = 0.084; SES × average reward rate: β = −0.25, SE = 0.18, z = −1.40, p = 0.161; SES × feedback × average reward rate: β = 0.18, SE = 0.15, z = 1.21, p = 0.225; Fig. 3A). Additionally, these distinct temporal dimensions of reward influenced RTs similarly, regardless of SES (SES × feedback: β = −0.002, SE = 0.005, t(109) = −0.45, p = 0.651; SES × average reward rate: β = 0.02, SE = 0.02, t(100) = 1.04, p = 0.301; Fig. 3A).
However, reward rate fluctuations modulated postreward pausing more in higher- than lower-SES adolescents (SES × feedback × average reward rate: β = −0.04, SE = 0.02, t(105) = −2.54, p = 0.013; Fig. 3B). That is, higher-SES adolescents slowed more following rare rewards (main effect of SES when the reward rate is centered at −1 SD below the mean to reflect reward scarcity: β = 0.02, SE = 0.007, t(684) = 2.20, p = 0.028; Fig. 3B). When rewards were plentiful, higher-SES adolescents slowed less following rewards than lower-SES adolescents (centered at +1 SD above the mean to reflect reward abundance: β = −0.02, SE = 0.008, t(4,949) = −2.05, p = 0.041; Fig. 3B), though neither group showed significant evidence of postreward pausing when rewards were plentiful (p’s > 0.087). Interestingly, SES was unrelated to individual differences in optimal learning rates (β = 0.06, SE = 0.04, t(114) = 1.32, p = 0.189), suggesting that heightened postreward pausing was not driven by a greater tendency to update expectations in response to new information. These findings suggest that adolescents from lower-SES backgrounds were less likely to adapt responses to immediate reward based on average reward rate fluctuations. Analyses reported in our extended analyses on the Open Science Framework (https://osf.io/9vhtw) demonstrate that these results are robust when using education and income to separately characterize SES.
Lower SES correlates with smaller caudate volumes
Higher SES was associated with larger caudate volumes (β = 96.61, SE = 37.65, t(103) = 2.57, p = 0.012; Fig. 4). In contrast, there were no significant associations between SES and the volumes of the putamen (β = 32.78, SE = 49.74, t(103) = 0.66, p = 0.511) or nucleus accumbens (β = 1.94, SE = 7.26, t(104) = 0.27, p = 0.790). Moreover, there were no SES × hemisphere interactions in any ROI (all p’s > 0.590), demonstrating that SES-related differences in volumes did not differ by hemisphere.
The striatum tracks fluctuations in the average reward rate
Across adolescents, mean activations were larger during reward than loss blocks in the caudate (β = 0.50, SE = 0.07, t(106) = 6.74, p < 0.001), putamen (β = 0.61, SE = 0.08, t(106) = 7.83, p < 0.001), and nucleus accumbens (β = 0.77, SE = 0.08, t(107) = 9.87, p < 0.001; Fig. 5A). Furthermore, striatal activations covaried with the average reward rate, such that a higher average reward rate led to greater activations in the caudate (β = 0.77, SE = 0.08, t(105) = 9.60, p < 0.001), putamen (β = 0.66, SE = 0.07, t(105) = 8.94, p < 0.001), and nucleus accumbens (β = 1.32, SE = 0.09, t(103) = 14.07, p < 0.001; Fig. 5B). These findings show that the striatum not only responds more to reward than loss in general but tracks moment-by-moment shifts in the average reward rate across time.
Lower SES correlates with reduced striatal responses to reward
Lower SES correlated with smaller activation level differences between reward and loss blocks in the caudate (β = 0.22, SE = 0.09, t(105) = 2.54, p = 0.013) and putamen (β = 0.25, SE = 0.09, t(104) = 2.73, p = 0.007) and marginally in the nucleus accumbens (β = 0.16, SE = 0.09, t(106) = 1.79, p = 0.077; Fig. 6A). None of these effects differed by hemisphere (SES × hemisphere: all p’s > 0.29). Furthermore, striatal activations covaried with average reward rate fluctuations more strongly in higher-SES adolescents in the putamen (β = 0.17, SE = 0.09, t(104) = 2.02, p = 0.046; Fig. 6B), but not the caudate (β = 0.08, SE = 0.10, t(104) = 0.88, p = 0.380) or nucleus accumbens (β = 0.02, SE = 0.11, t(101) = 0.18, p = 0.860). None of these effects differed by hemisphere (SES × hemisphere: all p’s > 0.21). Of note, the relationship between SES and reward-driven activations also did not differ by striatal subregion (SES × subregion interaction: all p’s > 0.10).
Discussion
We asked how SES in adolescence was related to reward-driven responses in the brain and behavior. Drawing on influential models of decision-making (Niv et al., 2006, 2007; Constantino and Daw, 2015), we examined how choices, RTs, and striatal activations were shaped by immediate reward outcomes and previous reward history (average reward rate fluctuations across time). We found that, behaviorally, participants were more likely to repeat a guess if it had led to a win (win–stay, lose–switch effects) and responded more slowly after receiving a reward (postreward pausing). Fluctuations in the average reward rate also shaped behavior: a higher reward rate hastened RTs and increased guess switching. Moreover, a low reward rate increased behavioral sensitivity to immediately rewarding outcomes, augmenting both win–stay, lose–switch effects and postreward pausing. Notably, compared to higher-SES adolescents, lower-SES adolescents exhibited reduced postreward pausing when rewards were scarce. We also observed that across participants, striatal activations were larger during reward than loss blocks and covaried with fluctuations in the average reward rate across time. However, relative to higher-SES adolescents, lower-SES adolescents displayed reduced activations during reward relative to loss blocks in the caudate and putamen and marginally in the nucleus accumbens. In addition, putamen activations tracked average reward rate fluctuations less in lower-SES adolescents. These findings show that the striatum tracks average reward rate fluctuations, which shape choices and RTs (Niv et al., 2006, 2007; Wang et al., 2013, 2021; Hamid et al., 2016). They also link lower SES in adolescence to reduced reward sensitivity, both in the brain and behavior.
We found that adolescents tracked fluctuations in the average reward rate across time, which influenced decisions and RTs. When rewards were abundant, individuals were more likely to switch choices across trials. These findings align with studies in human adults (Niv et al., 2007; Constantino and Daw, 2015; Sukumar et al., 2023) and support theories of decision-making (Constantino and Daw, 2015; Sukumar et al., 2023). These theories argue that when the average environmental reward rate is lower than an option's perceived value, it is rational to “stay” with a rewarding option due to the limited prospects of finding rewards elsewhere. Conversely, when the environmental reward rate is higher than the perceived value of an option, it makes sense to switch to exploring alternative sources of reward. It is possible, then, that adolescents used the average reward rate as a threshold for whether to switch or stay with a previous choice. Future research could examine how the tendency to track average reward rate fluctuations develops and whether adolescents, given their heightened sensitivity to reward (Galvan et al., 2006; Cohen et al., 2010; Galvan, 2010; Davidow et al., 2016), might be even more attuned to fluctuations in the average reward rate across time than adults.
A higher average reward rate also covaried with faster RTs. This finding is consistent with research in human adults (Beierholm et al., 2013; Otto and Daw, 2019) and supports theories arguing that fluctuations in the average reward rate shape the cost of time (Niv et al., 2006, 2007). That is, when rewards are abundant, action delays are presumably more costly because one forfeits relatively more potential rewards, incentivizing faster responses. Interestingly, other researchers have theorized that rewards also govern the opportunity cost of engaging effort and sustaining attention (Kurzban et al., 2013; Esterman et al., 2016; Massar et al., 2016; Esterman and Rothlein, 2019; Otto and Daw, 2019; Lin et al., 2022), raising the possibility that average reward rate fluctuations shape diverse aspects of cognition, such as fluctuations in attention (Decker and Duncan, 2020; A. Decker et al., 2023; A. L. Decker et al., 2023). Our findings therefore not only support theories linking reward rate fluctuations to motivation and decision-making, extending these ideas to human adolescents, but also raise questions about the influence of reward rate fluctuations on other aspects of cognition.
Adolescents were also responsive to immediately rewarding outcomes, in line with previous research (Reynolds et al., 2001; Hamid et al., 2016): they were most likely to repeat a previous choice if it had led to a reward on the prior trial and responded more slowly after a reward outcome, a phenomenon known as “postreward pausing” (Felton and Lyon, 1966; Crossman, 1968; McMillan, 1971; Wallace and Mulder, 1973; Schlinger et al., 2008; Williams et al., 2011). Notably, these effects were amplified by a lower average reward rate. Our finding adds to a growing body of research suggesting the background average reward rate modulates sensitivity to immediate reward. Indeed, in animals and humans, postreward pausing is prolonged when rewards are scarce (Schlinger et al., 2008). Furthermore, fewer recent rewards and lower tonic dopamine amplify phasic dopamine firing (Hamid et al., 2016)—a finding that potentially provides a neurobiological explanation for the increased reward responsivity we observed here when rewards were scarce. Slower responses after unexpected reward could also reflect surprise due to the infrequency of the event (A. Decker et al., 2020) or heightened response caution that facilitated more deliberate decision-making (Schlinger et al., 2008, p. 50). Altogether, this finding shows that average reward rate fluctuations influenced responses to immediate outcomes, which shaped choices and RTs. When adolescents tune into the average environmental reward rate, they may make more adaptive decisions according to the overall rewards available in the environment.
We also observed that the extent of RT slowing after rare rewards varied by SES. Adolescents from higher-SES backgrounds showed greater postreward pausing than lower-SES adolescents when rewards were scarce. This finding could reflect greater attunement to reward rate fluctuations among higher-SES adolescents, which would be expected to increase the saliency of receiving a rare reward when the reward rate was low. However, exploratory analyses showed that SES did not correlate with learning rates—the tendency to update the average reward rate in response to new outcomes. Thus, greater postreward pausing may instead reflect a greater responsivity to rewards in reward-scarce contexts specifically, rather than a general tendency to more readily update the average reward rate.
Interestingly, reward rate fluctuations covaried with striatal activations in the caudate, putamen, and nucleus accumbens, such that a higher reward rate led to greater activations in these regions. These findings are consistent with animal studies showing that tonic dopamine fluctuations in the striatum track the average reward rate and, in doing so, shape motivational vigor and decision-making (Wang et al., 2013, 2021; Hamid et al., 2016). To our knowledge, this is the first human fMRI study to demonstrate this relationship.
Our results extend prior findings linking lower SES to diminished reward sensitivity in neocortical regions like the anterior cingulate cortex (Palacios-Barrios et al., 2021) and parietal cortex (White et al., 2022). Indeed, we observed that the extent of reward-driven activations in the striatum differed by SES. Higher-SES adolescents showed greater reward-driven activations than lower-SES adolescents in the putamen, caudate, and marginally in the nucleus accumbens. Moreover, putamen activations tracked average reward rate fluctuations less in lower-SES adolescents. Notably, prior studies employed incremental learning tasks in which adolescents learned the predictive value of cues over time. Our focus on a reward task that did not involve learnable cue–outcome contingencies broadens the literature by showing that reduced reward sensitivity is observed even when learning demands are eliminated.
Our findings support proposals that lower-SES environments reduce reward sensitivity (Seligman, 1972). Past literature suggests that chronic stress diminishes the belief that actions have consequences, rendering individuals less motivated to pursue rewarding outcomes (Seligman, 1972). It is therefore possible that chronic stress and reduced perceived control, which are more common among lower-SES individuals (Hackman and Farah, 2009; Hackman et al., 2010; McLaughlin et al., 2014; Farah, 2018), mediated the effects we observed here. Targeted research employing direct measures of stress could test this mechanism.
The present findings offer insights into why cognitive performance (Noble et al., 2007) and emotional well-being (Reiss, 2013) are often reduced in lower-SES adolescents. Reward sensitivity plays a vital role in many aspects of cognition, influencing everything from the ability to learn important associations (Davidow et al., 2016) to the ability to remain attentive to important events (Shenhav et al., 2013; Esterman and Rothlein, 2019). Rewards boost motivation (Schultz, 1993; Westbrook and Braver, 2016; Frömer et al., 2021; Westbrook et al., 2021) and support success in short- and long-term endeavors, such as academic and workplace pursuits. Disparities in reward sensitivity, therefore, may contribute to disparities in learning, attentional performance, and motivation. Given the intimate link between reward sensitivity and emotional well-being, reduced reward sensitivity may contribute to the higher rates of depression (Reiss, 2013; Auerbach et al., 2022) and lower life satisfaction observed in lower-SES groups (Kahneman and Deaton, 2010). On a broader level, these insights stress the importance of socioeconomic policies (Farah, 2018) aimed at reducing the burdens of poverty to foster cognitive and emotional well-being in society.
Data Availability Statement
Code and data can be found at the following link: https://osf.io/pqtby/.
Footnotes
We thank Susan Whitfield-Gabrieli, Hause Lin, and Kenneth Harris for providing feedback on the neuroimaging analyses. This research was supported by the William and Flora Hewlett Foundation [#4429 (J.D.E.G.)] and a Natural Sciences and Engineering Research Council of Canada Postdoctoral Fellowship (A.L.D.).
*R.R. and J.D.E.G. are co-senior authors.
The authors declare no competing financial interests.
Correspondence should be addressed to Alexandra L. Decker at alexandraleerdecker@gmail.com.