Abstract
Impulsive decisions arise from preferring smaller but sooner rewards over larger but later rewards. How neural activity and attention to choice alternatives contribute to reward decisions during temporal discounting is not clear. Here we probed (1) attention to and (2) neural representation of delay and reward information in humans (both sexes) engaged in choices. We studied behavioral and frequency-specific dynamics supporting impulsive decisions on a fine-grained temporal scale using eye tracking and MEG recordings. In one condition, participants decided for themselves; in a second, prosocial condition, they pretended to decide for their best friend, which required perspective taking. Hence, the conditions differed in whether the reward had value for the participants themselves or for another person. Stronger impulsivity was reliably found across three independent groups for prosocial decisions. Eye tracking revealed a systematic shift of attention from the delay to the reward information, and differences in eye tracking between conditions predicted differences in discounting. High-frequency activity (175-250 Hz) distributed over right frontotemporal sensors correlated with delay and reward information in consecutive temporal intervals for high-value decisions for oneself but not for the friend. Collectively, the results imply that the high-frequency activity recorded over frontotemporal MEG sensors plays a critical role in choice option integration.
SIGNIFICANCE STATEMENT Humans face decisions between sooner smaller rewards and larger later rewards daily. An objective benefit of losing weight over a longer time might be devalued in the face of ice cream because people prefer currently available options and insufficiently consider long-term alternatives. The degree to which neural representation and attention to choice alternatives contribute to such decisions is not clear. We investigated correlates of such decisions in participants deciding for themselves or pretending to choose for a friend. Behaviorally, participants discounted less in self-choices compared with the prosocial condition. Eye movement and MEG recordings revealed how participants represent choice options, most evidently for options with high subjective value. These results advance our understanding of neural mechanisms underlying decision-making in humans.
Introduction
Reward value decreases as a function of time: the longer we have to wait, the less a reward is typically valued. Hence, delayed delivery of a larger reward converts an objective value into a perceived lesser value. This results in choosing a smaller but sooner (SS) rather than a larger but later reward (LL), a phenomenon known as delay discounting (DD). Impulsive decisions might arise from the tendency to prefer SS rewards over LL rewards. Previous fMRI studies focused on neuroanatomical correlates of subjective valuation and observed interactions of multiple independent valuation systems in the ventromedial PFC (vmPFC) and the dorsolateral PFC (dlPFC) (McClure et al., 2004, 2007), where goal values correlate with vmPFC activity and the amount of self-control with dlPFC activity (Hare et al., 2009). Impulsivity might also result from insufficiently considered objective alternatives (Ainslie, 1975; Myerson et al., 2003; Olson et al., 2007), which have to be translated into subjective values (Mazur, 1987; Green and Myerson, 2004). This view suggests that poor attention to objective values might lead to stronger DD.
Differences in discounting between participants could also result from insufficient representation of choice options in working memory (Fuster, 1990; Baddeley, 1992; Goldman-Rakic, 1992) before an option is selected. Importantly, self-awareness counteracts this difference in discounting (Peters and Büchel, 2010). This observation suggests that choices neglecting self-awareness should lead to steeper discounting, but two contributions to these effects are still debated. First, how attentional mechanisms and the degree of neural representation contribute to decisions in intertemporal choices remains uncertain. Specifically, the time course and neuroanatomical basis of the integration of choice options (delay and reward) for subjective valuation are not defined. Second, whether and how these mechanisms are modulated qualitatively by subjective awareness is not clear. We propose that attentional selection and neural representation require more effort when deciding for oneself, and that this effort is reduced in prosocial decisions (Lockwood et al., 2017). In accordance with previous studies (Lockwood et al., 2017), we tested how taking the perspective of another person alters the effort to attentionally select and represent choice options. The experimental contrast relies on taking the perspective of the best friend (pretending to decide for the best friend), which is a prerequisite for prosocial acts and by definition reduces self-awareness.
Using eye tracking and the high temporal resolution of MEG recordings, we compared patterns of attentional evaluation and representation of objective values in a DD paradigm in two conditions. In one condition, participants decided on their own reward, whereas in a second, anonymous prosocial condition, they pretended to decide for an imagined best friend. Hence, the conditions differed only in the subjective awareness of reward for the participants. We tested the hypothesis that participants decide more impulsively in prosocial decisions, even if this decision is completely anonymous, since participants only visualize their best friend (Lockwood et al., 2017). We further hypothesized that objective choice alternatives are less considered and consequently less represented when decisions are made for others. Eye movements are a sensitive proxy of attentional shifts; we therefore assessed the time course of the attentional focus using eye movement recordings and predicted less attention in choices made for others. Using MEG in an initial approach, we evaluated, first, whether frequency-specific activity integrates choice options and, second, whether differences in choice options (reward and delay in SELF vs OTHER) within the attentional focus yielded differences in neural activation.
Materials and Methods
General paradigm
In all experiments, participants were asked to choose between an LL amount of 10€ at a variable delay D (1, 2, 5, 11, 24, or 52 weeks; presented in pseudorandom order such that all delays were evenly distributed across the entire experiment) and an SS reward. Participants made 10 choices for each delay while SS was adjusted according to the previous response to reach the individual indifference point with equivalent LL and SS option. SS was calculated as follows:
$$SS = \frac{LL}{1 + kD}$$
with k denoting the discount parameter (van den Bos et al., 2014). In the first trial, the discount parameter k was set to 0.02 in all participants and both conditions (van den Bos et al., 2014; Wang et al., 2016). Choosing SS instead of LL means that the subjective value of LL was lower than that of SS. Consequently, SS was decreased by increasing k. If LL was chosen, k was decreased to yield a higher SS. To allow a fast change toward the individual discount parameter at the beginning of the experiment, k was increased or decreased by 10% of its value in the previous trial during the first 20 trials ($k_{t+1} = k_t \pm 0.1\,k_t$, with t denoting the trial number). To allow a fine-grained variation in the following trials, k was adjusted by 5% of its value in the previous trial ($k_{t+1} = k_t \pm 0.05\,k_t$). Each trial started with a fixation point (3 s ± 200 ms) before the LL and SS choice options were presented simultaneously, one on the left and one on the right side of the screen, with sides assigned in pseudorandom order. The monetary reward was always presented above fixation and the delay below fixation (see Fig. 1). Participants were instructed to evaluate choice options closely before responding and to indicate their choice by pressing a left or right button with their left or right index finger in two conditions. In one condition, participants were instructed to evaluate choices and decide which amount had a higher value for themselves (self condition, SELF). In the second, prosocial but anonymous condition, participants were instructed to evaluate choices regarding the presumably higher value for their best friend (other condition, OTHER) (Lockwood et al., 2017). All participants made their decisions in both conditions in separate blocks in counterbalanced order. Before blocks started, participants were instructed whether they had to make their decisions for themselves (SELF) or whether they had to pretend to decide for their best friend (OTHER).
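The adaptive adjustment of k described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes the standard hyperbolic form SS = LL / (1 + kD), and the function names `ss_amount` and `update_k` as well as the three example responses are hypothetical.

```python
def ss_amount(ll: float, k: float, delay_weeks: float) -> float:
    """SS offer under an assumed hyperbolic discount function SS = LL / (1 + k * D)."""
    return ll / (1.0 + k * delay_weeks)


def update_k(k: float, chose_ss: bool, trial: int) -> float:
    """Raise k after an SS choice, lower it after an LL choice.

    The step is 10% of the current k during the first 20 trials and 5% afterwards.
    """
    step = 0.10 if trial <= 20 else 0.05
    return k * (1.0 + step) if chose_ss else k * (1.0 - step)


# Hypothetical sequence of (delay in weeks, participant chose SS?) for three trials.
responses = [(5, True), (24, False), (1, True)]

k = 0.02  # starting value used in the first trial
offers = []
for trial, (delay, chose_ss) in enumerate(responses, start=1):
    offers.append(round(ss_amount(10.0, k, delay), 2))  # LL is fixed at 10 euros
    k = update_k(k, chose_ss, trial)
```

Because the step is proportional to the current k, the adjustment becomes smaller in absolute terms as k approaches zero, which keeps k positive throughout.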
Participants were instructed before the experiment that, on completion of the experiment, one trial is randomly chosen from all presented choices (SELF and OTHER) and the respective monetary reward will be paid at the respective delay to avoid the belief that outcomes were only hypothetical. Participants were informed that this trial will be chosen randomly from both the SELF and OTHER condition to avoid biasing decisions to one of the two conditions. Prosocial behavior is commonly assessed as the willingness to benefit others. Altruism adds that this benefit has to be at their own costs. Most, if not all, theories of reciprocity, altruism, or prosocial behavior start from the assumption that perspective taking is the starting point of each social act. Our critical experimental condition is the instruction to assume the perspective of their best friend and to pretend to perform an act that (hypothetically) benefits others.
General analysis
This paradigm allows only a limited number of trials before participants arrive at their indifference point. In accord with previous studies, we used 60 trials. In addition, especially in MEG recordings, trials often have to be rejected because of artifacts. In participants with a strong tendency to choose only one option across trials, this reduction would disproportionately affect the option with fewer trials. As a result, averaged MEG responses could be largely determined by a prepotent motor response rather than by signals related to the decision, especially in the OTHER condition, in which we assumed that objective choice alternatives are less considered. To compare results across groups, we applied the same criterion to all participants in all experiments. To identify participants with a strong tendency to choose only one option across trials, we calculated a choice index (CI) as follows:
$$CI = \frac{|N_{SS} - N_{LL}|}{N_{SS} + N_{LL}}$$
with $N_{SS}$ denoting the number of SS choices and $N_{LL}$ denoting the number of LL choices. Participants with CI ≥ 0.66, that is, one option was chosen almost 5 times as often as the other option, were excluded in both conditions (see Fig. 2). We used the average of the last 10 updated ln(k) parameters, computed separately for SELF and OTHER decisions, as a proxy for the individual indifference point. A higher k value indicates steeper discounting of delayed rewards and thus more impulsive choices. As the distribution of discount rates is highly right-skewed, we used log-transformed k (ln k) in all statistical analyses.
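As a rough illustration of the exclusion rule, the sketch below computes a choice index under the assumption that CI is the absolute difference between SS and LL counts divided by their sum; this form is inferred from the statement that CI ≥ 0.66 corresponds to one option being chosen almost 5 times as often as the other. The function names are hypothetical.

```python
def choice_index(n_ss: int, n_ll: int) -> float:
    """Assumed form: CI = |N_SS - N_LL| / (N_SS + N_LL), ranging from 0 to 1."""
    total = n_ss + n_ll
    if total == 0:
        raise ValueError("at least one choice is required")
    return abs(n_ss - n_ll) / total


def is_excluded(n_ss: int, n_ll: int, cutoff: float = 0.66) -> bool:
    """Apply the CI >= 0.66 exclusion criterion to one participant and condition."""
    return choice_index(n_ss, n_ll) >= cutoff
```

With 60 trials, 50 SS choices against 10 LL choices give CI = 40/60 ≈ 0.67, which meets the cutoff, whereas 36 against 24 give CI = 0.2.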
Statistical analysis
To correct statistical significance for multiple comparisons, we compared each statistical parameter against a surrogate distribution, which was constructed by randomly shuffling the labels of the trials and repeating the ANOVA, t test, or Pearson's correlation. Consequently, reported p values represent the statistical significance relative to the constructed surrogate distribution. All permutations (see detailed information for each test below) were conducted with MATLAB, and each surrogate distribution was constructed by running 1000 label permutations, yielding 1000 surrogate values against which the observed statistical parameters were compared. The significance criterion in all these tests was a statistical parameter corresponding to p < 0.025. These critical values correspond to the 97.5% confidence level and are denoted as Fcrit, Tcrit, or rcrit. This approach counters the possibility that MEG and/or eye tracking data might be differently distributed. In general, bootstrap methods have the advantage of accounting for the dependence structure of p values. Bootstrapping offers a way to deal with situations in which the test statistic may not follow the distribution assumed by large sample theory.
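The following sketch shows one way such a surrogate distribution could be built for a paired comparison. It is an assumption-laden illustration rather than the authors' MATLAB code: labels are permuted by randomly swapping the two condition values within each participant, the statistic is a paired t value, and the ln(k) numbers are invented for demonstration.

```python
import math
import random
import statistics


def paired_t(a, b):
    """Paired t statistic for two equally long lists of values."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def surrogate_critical_value(a, b, n_perm=1000, quantile=0.975, seed=0):
    """Critical t value from a label-permutation surrogate distribution."""
    rng = random.Random(seed)
    surrogate = []
    for _ in range(n_perm):
        pa, pb = [], []
        for x, y in zip(a, b):
            if rng.random() < 0.5:  # swap the condition labels within this pair
                x, y = y, x
            pa.append(x)
            pb.append(y)
        surrogate.append(paired_t(pa, pb))
    surrogate.sort()
    return surrogate[min(int(quantile * n_perm), n_perm - 1)]


# Invented ln(k) values for ten participants (illustration only).
lnk_self = [-4.5, -4.1, -4.8, -3.9, -4.3, -4.6, -4.0, -4.4, -4.2, -4.7]
lnk_other = [-3.9, -3.8, -4.1, -3.5, -3.6, -4.2, -3.3, -3.9, -3.4, -4.0]

t_obs = paired_t(lnk_other, lnk_self)
t_crit = surrogate_critical_value(lnk_other, lnk_self)
significant = t_obs > t_crit
```

The observed statistic counts as significant when it exceeds the value at the chosen percentile of the surrogate values, which mirrors the Tcrit logic described above.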
Power of correlation analysis
We conducted individual correlation analyses (e.g., gaze stability in eye tracking vs discount parameter or MEG activity vs discount parameter) where the number of participants has to be high to avoid false positives. Too few participants might not correctly reflect a small correlation, leading to the false assumption that there is no correlation despite a de facto but low (e.g., ρ = 0.3) correlation at the population level. Hence, a high number of subjects is needed for low correlation strength, but fewer participants can correctly reveal higher correlations at the population level (e.g., ρ = 0.6). Increasing the number of participants without changing mean and SD decreases the p value and hence increases the likelihood of exaggerating small effects. The number of participants used in our study is a trade-off between both approaches: a well-powered individual differences analysis without inflating mean effects. We determined the power of the maximal correlation coefficient r using G*Power 3.1 (Erdfelder et al., 2009). The power values are given as β separately for each analysis.
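To make the dependence of power on sample size and correlation strength concrete, the sketch below uses the Fisher z approximation for a two-sided test of a Pearson correlation. This is not the exact routine implemented in G*Power, the default alpha of 0.05 is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_correlation_power(rho: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power to detect a population correlation rho with n participants.

    Uses the Fisher z transform and ignores the negligible opposite tail.
    """
    if n <= 3:
        raise ValueError("n must be larger than 3")
    z_effect = abs(math.atanh(rho)) * math.sqrt(n - 3)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf(z_effect - z_crit)


power_strong = approx_correlation_power(0.6, 20)  # larger correlation, n = 20
power_weak = approx_correlation_power(0.3, 20)    # smaller correlation, n = 20
```

Under this approximation, 20 participants give clearly higher power for ρ = 0.6 than for ρ = 0.3, in line with the reasoning above.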
Reaction time experiment
Choice options were presented on the computer screen spanning a visual field with a horizontal visual angle of 9.5° and vertical visual angle of 1.4°. Participants were instructed to closely evaluate the choice options and press one of two buttons indicating their choice when they had the feeling that they had made their decision. If they did not respond within 10 s, the next trial started (see Fig. 1A). We compared mean decision time and ln(k) parameters across participants between conditions using separate paired t tests (see Fig. 3A,B).
Eye tracking experiment
Participants were instructed to withhold their decision until the fixation point turned red (3 s ± 200 ms following presentation of choice options; sufficiently long to exceed mean decision time in the first experiment) to eliminate any impact of button presses (see Fig. 1B). Choice options were presented further away from each other, spanning an area of 41 × 15.5 cm (horizontal visual angle of 32° and vertical visual angle of 12°), to clearly distinguish gaze direction to the delay and reward. For eye movement recording, we used the Eyelink 1000 system operated on Windows 7 and a desktop-mounted Eyelink CL camera with a TV lens (35 mm 1:1.6). All participants used a chin and forehead rest at a distance of 71.5 cm from the monitor and 59 cm from the camera. Visual stimuli were presented on a Samsung Syncmaster 2233 (22 inch) with a resolution of 1680 × 1050. In each participant, we tracked the pupil diameter and corneal reflex of the left eye with a sampling rate of 2000 Hz. Before each trial block, we performed a calibration session with the built-in 9-point grid method.
Data preprocessing
The resulting eye tracking data (time series of vertical eye movements to the reward presented above and the delay presented below the fixation cross) were used to characterize temporal dynamics of evaluation of delay and monetary reward. First, we identified trials (−1 to 4 s around stimulus onset) with poor fixation in the baseline period (−1 s to stimulus onset). That is, we calculated for each trial the mean y coordinate (vertical eye movements between reward and delay). Trials whose mean y value deviated by >2 SDs from that of all trials (indicating poor fixation) were excluded from analyses. Since delay and reward were always presented below and above fixation, respectively (LL and SS options were presented pseudorandomly to the left and right), we focused on analyzing y coordinates across time. We baseline-corrected by subtracting from each data point (y coordinate) the mean y coordinate within the 1 s preceding the stimulus onset in each trial. To define gaze direction as a function of time across trials, we then calculated at each time point a histogram of all y coordinates (see Fig. 4B) separately for each participant. This results in high probability values for fixation before stimulus onset and high probability values for delay and reward following stimulus onset. These can be identified as colored bands in front of an otherwise dark blue background (locations on the screen at which participants did not look consistently) in Figure 4B. From these probability maps, we extracted three time series defined by gaze to delay Dt, to fixation Ft, and to monetary reward Rt, representing the time-varying probability of looking at delay, fixation, and reward, respectively. Since participants did not look exactly at one location, we defined a spatial margin around each of the three spatial regions based on the real eye movements.
To this end, individual probability maps were averaged across participants and conditions and across time, leading to three probability maxima, which correspond with spatial location of delay, fixation, and reward (see Fig. 4B, right, dashed lines). The margin around these maxima is defined by the inflection points (solid lines). Probability values around the three maxima corresponding with spatial location on the screen (see Fig. 4B) were averaged at each time point, leading to three time series for each participant and condition. Figure 4B (bottom) shows the time series for delay and reward.
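The preprocessing steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the region boundaries are hypothetical placeholders rather than the inflection-point margins derived from the averaged probability maps, negative y values are taken to lie below fixation, and the toy trials contain only three samples each.

```python
def baseline_correct(trial, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample of one trial."""
    base = sum(trial[:n_baseline]) / n_baseline
    return [y - base for y in trial]


def region_probability(y_trials, t, lower, upper):
    """Fraction of trials whose y coordinate at sample t lies within [lower, upper)."""
    values = [trial[t] for trial in y_trials]
    return sum(lower <= v < upper for v in values) / len(values)


def gaze_time_series(y_trials, regions):
    """Time-varying probability of looking at each region, one series per region."""
    n_samples = len(y_trials[0])
    return {
        name: [region_probability(y_trials, t, lo, hi) for t in range(n_samples)]
        for name, (lo, hi) in regions.items()
    }


# Hypothetical margins in degrees around delay (below), fixation, and reward (above).
regions = {"delay": (-8.0, -2.0), "fixation": (-2.0, 2.0), "reward": (2.0, 8.0)}

# Toy baseline-corrected y coordinates: four trials, three time samples each.
trials = [[0, -5, 5], [0, -5, 5], [0, 5, 5], [0, 0, -5]]
series = gaze_time_series(trials, regions)
```

In this toy example, gaze starts at fixation, moves mostly to the delay at the second sample, and mostly to the reward at the third, which is the kind of pattern the Dt, Ft, and Rt time series are meant to capture.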
Data analysis
Discounting differences. First, we compared ln (k) parameters across participants to test whether participants discounted differently between conditions (see Fig. 4A) with a paired t test.
Intervals of option evaluation. To test for difference of gaze direction, we conducted a t test at each time point between Dt and Rt time series across participants, leading to TDR time series. This time series shows when participants inspected on average more the delay or the reward. The level of significance of TDR was corrected for multiple comparisons by comparing each TDR value against a surrogate distribution. This surrogate distribution was constructed by randomly reassigning the labels (delay vs reward) to the single participants in 1000 permutations. This leads to 1000 surrogate TDR value time series. Significance criterion was a TDR value with p < 0.025 within the surrogate distribution of all T values (see Fig. 4B).
Differences in option evaluation. We tested whether participants evaluated choice options differently in both conditions. To test for difference of gaze direction between conditions, we conducted a t test at each time point for Ft, Dt, and Rt across participants, leading to three t value time series (TF, TD, and TR) capturing the differences between conditions. The level of significance of each of the three t value time series was corrected for multiple comparisons by comparing each t value time series against a surrogate distribution as above but swapping labels (SELF vs OTHER). Significance criterion was a t value with p < 0.025 within the surrogate distribution of all surrogate t values (see Fig. 4C).
Correlation of gaze entropy and decision. We hypothesized that differences in decision-making result from differences in choice evaluation. Dt and Rt time series representing the probability to look at the delay and the reward were used to estimate the entropy, which gives the average of information of all events (here gaze at delay and reward) and reflects the predictability or stability of gaze direction. If gaze entropy is high, it is hard to predict whether participants look at the delay or the reward; but if gaze entropy is low, then gaze direction is stable and predictability is high. In both conditions, entropy was calculated at each time point t as follows:
$$H_t = -\sum_{i} p_{i,t} \log p_{i,t}$$
with i denoting the two events (reward and delay) and $p_{i,t}$ the likelihood of directing gaze at event i at time point t. The resulting entropy time series HOTHER was subtracted from HSELF, leading to an HΔ time series for each participant. Individual differences in discount parameters were correlated with HΔ values at each time point, leading to a new Pearson's correlation r time series. The level of significance of r was corrected for multiple comparisons by comparing each r against a surrogate distribution. This surrogate distribution was constructed in the following way. For each iteration, we randomly assigned the individual entropy values HΔ across participants. We then correlated these randomly assigned values with the individual discount values of our participants in 1000 permutations. This leads to 1000 surrogate r value time series. Significance criterion was an r value with p < 0.025 within the surrogate distribution of all r values (see Fig. 4D).
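A compact sketch of the gaze entropy computation is given below. It assumes the Shannon form summed over the two events (delay and reward) with a base-2 logarithm; the logarithm base and the function names are assumptions made for illustration, and terms with zero probability are treated as contributing zero.

```python
import math


def gaze_entropy(p_delay, p_reward):
    """Entropy time series from the probabilities of looking at delay and reward."""
    entropy = []
    for pd, pr in zip(p_delay, p_reward):
        # Terms with p == 0 are skipped, following the convention 0 * log(0) = 0.
        entropy.append(-sum(p * math.log2(p) for p in (pd, pr) if p > 0))
    return entropy


def entropy_difference(h_self, h_other):
    """Difference time series, computed as SELF minus OTHER."""
    return [hs - ho for hs, ho in zip(h_self, h_other)]


# Toy probabilities at two time points for one participant.
h_self = gaze_entropy([0.5, 1.0], [0.5, 0.0])     # unstable gaze, then stable gaze
h_other = gaze_entropy([0.25, 0.5], [0.25, 0.5])
h_delta = entropy_difference(h_self, h_other)
```

Equal probabilities of 0.5 for delay and reward give an entropy of 1 bit (gaze hard to predict), whereas a probability of 1 for one event gives an entropy of 0 (stable gaze).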
MEG recordings
Data acquisition
In a third group, participants (N = 24; 7 female; mean age: 26.17; SD = 5.16) were seated in a magnetically shielded room in which MEG activity was recorded while participants performed the experiment. To record vertical and horizontal eye movements, electro-oculographic activity was obtained. Electrode impedance was kept <10 kΩ. For the data acquisition, a whole-head, 102-channel magnetometer array (Elekta Neuromag TRIUX) with internal helium recycler was used. The MEG system contains 102 sensor fields, each equipped with one magnetometer measuring the normal field component and two orthogonally oriented planar gradiometers measuring the gradient components. The participants sat in an upright position underneath the MEG “helmet.” MEG data were sampled at 2000 Hz with a bandpass filter from DC to 660 Hz.
First, we tested whether participants discount more in the OTHER condition compared with the SELF condition comparing log-transformed discount parameters between conditions. Participants were instructed to withhold decision until the fixation point turned red (3 s ± 200 ms following presentation of choice options; sufficiently long to exceed mean reaction time in the first experiment) to eliminate the impact of button presses (see Fig. 1C). In the eye tracking experiment, we forced participants to shift gaze and hence their attentional focus. Here, to suppress eye movements, choice options were presented on the screen spanning an area of 3 × 12 cm (horizontal visual angle of 6.8° and vertical visual angle of 1.79°). This allows us to assess potential intervals when delay and reward are within the attentional focus, ruling out the possibility that differences in MEG activity result from eye movements. Furthermore, all trials with activity >2 SDs of all electro-oculographic trials were rejected.
Preprocessing
We used MATLAB 2013b (The MathWorks) for all offline data processing. All filtering (see below) was done using zero phase-shift IIR filters. First, we used an absolute threshold of 300 fT to discard signal epochs of excessive, nonphysiological amplitude. We then visually inspected all data, excluded epochs exhibiting excessive muscle activity, as well as time intervals containing artifactual signal distortions, such as signal steps or pulses. We refrained from applying artifact reduction procedures that affect the dimensionality and/or complexity of the data (e.g., independent component analysis). The raw signal of all remaining epochs was filtered between 1 and 275 Hz. A notch filter was applied to remove line noise (±2 Hz around the first 5 harmonics) before filtering in specific frequency bands (see below).
Data analysis
Discounting differences. First, we compared ln(k) parameters across participants to test whether participants discounted differently between conditions (see Fig. 5A).
Choice options related to amplitude modulation. Next, we tested whether brain activity shows significant amplitude modulation in response to the presentation of choice options. For each trial (−2 s to 6 s around stimulus onset, sufficiently long to prevent any edge effects during filtering), we bandpass filtered each sensor's time series in 37 frequency bands (log-spaced between 1 and 330 Hz) with a bandwidth of 15% of the center frequency. We obtained the analytic amplitude Af of each frequency band f by Hilbert-transforming the filtered time series. We smoothed the time series such that the amplitude value at each time point n is the mean of 10 ms around that time point. We then baseline-corrected the brain activity by subtracting the mean activity in the 1 s preceding the stimulus onset (−1 to 0 s) in each trial of each magnetometer.
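The layout of the filter bank can be illustrated with a short sketch. It only computes band edges and does not perform the filtering or the Hilbert transform; it assumes that the 15% bandwidth is centered symmetrically on each center frequency, and the function name is hypothetical.

```python
def log_spaced_bands(f_min=1.0, f_max=330.0, n_bands=37, rel_bandwidth=0.15):
    """Return (low edge, center, high edge) in Hz for logarithmically spaced bands."""
    ratio = (f_max / f_min) ** (1.0 / (n_bands - 1))  # constant factor between centers
    bands = []
    for i in range(n_bands):
        center = f_min * ratio ** i
        half_width = 0.5 * rel_bandwidth * center  # assumed symmetric around the center
        bands.append((center - half_width, center, center + half_width))
    return bands


bands = log_spaced_bands()
```

Because the bandwidth scales with the center frequency, the bands become wider in absolute terms toward higher frequencies, which matches the logarithmic spacing of the centers.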
We then identified stimulus-responsive frequency bands showing a significant amplitude modulation following the onset of the choice display. We first averaged Af across all trials, magnetometers, and participants, resulting in one amplitude time series for each frequency. We then calculated the average baseline activity $\bar{A}_f^{\mathrm{base}}$ across the 500 ms preceding the stimulus onset. For each frequency band, we subtracted $\bar{A}_f^{\mathrm{base}}$ from the activity modulation $\bar{A}_f^{\mathrm{stim}}$ averaged across the 3 s following the stimulus onset. To control the significance threshold for multiple comparisons, the difference between $\bar{A}_f^{\mathrm{stim}}$ and $\bar{A}_f^{\mathrm{base}}$ was compared against an empirical distribution derived from randomly shifted time series (Npermutations = 1000). In each iteration, the time series of each channel was shifted separately (circular shift of the entire trial time series), and new (surrogate) trial averages ($\bar{A}_f^{\mathrm{base}}$ and $\bar{A}_f^{\mathrm{stim}}$) were calculated from the shifted trials. Frequency bands exceeding the 97.5th or below the 2.5th percentile of the frequency-specific surrogate $\bar{A}_f^{\mathrm{stim}} - \bar{A}_f^{\mathrm{base}}$ distribution (see Fig. 5B, dashed gray lines) were classified as showing a significant amplitude modulation following presentation of choice options.
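The circular-shift surrogate procedure can be sketched as below. This is an illustrative simplification, not the original analysis: it operates on a single channel and frequency band, each trial is shifted by an independent random offset, and the step-like toy trials and all function names are assumptions.

```python
import random


def circular_shift(x, k):
    """Rotate a list to the right by k samples."""
    k %= len(x)
    return x[-k:] + x[:-k] if k else list(x)


def trial_average(trials):
    """Average across trials at every time sample."""
    n = len(trials)
    return [sum(column) / n for column in zip(*trials)]


def modulation(avg, onset):
    """Mean post-stimulus amplitude minus mean baseline amplitude."""
    base = sum(avg[:onset]) / onset
    post = sum(avg[onset:]) / (len(avg) - onset)
    return post - base


def surrogate_modulations(trials, onset, n_perm=1000, seed=0):
    """Sorted surrogate modulation values obtained from circularly shifted trials."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_perm):
        shifted = [circular_shift(t, rng.randrange(len(t))) for t in trials]
        values.append(modulation(trial_average(shifted), onset))
    return sorted(values)


# Toy data: 20 trials with a step increase in amplitude at stimulus onset (sample 50).
onset = 50
trials = [[0.0] * onset + [1.0] * 50 for _ in range(20)]

observed = modulation(trial_average(trials), onset)
surrogate = surrogate_modulations(trials, onset)
lower, upper = surrogate[int(0.025 * len(surrogate))], surrogate[int(0.975 * len(surrogate))]
significant = observed > upper or observed < lower
```

Shifting destroys the alignment of the amplitude time course with stimulus onset while preserving each trial's amplitude distribution and autocorrelation, so the surrogate values indicate what modulation would be expected without a stimulus-locked response.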
The previous analysis informed us which frequency bands showed a significant modulation in response to the presentation of choice options. Next, we tested which channels contributed significantly to the stimulus response modulation in frequency bands with significant amplitude modulation. Hence, we filtered raw time series in the broader frequency bands showing significant amplitude modulation found in the previous analysis. We first averaged Af across all trials and participants and repeated the analysis as outlined above. Magnetometer signals exceeding the two-sided 95th percentile of the surrogate distribution were classified as showing a significant amplitude modulation following presentation of choice options (see Fig. 5B).
Amplitude modulation with delay. In the next step, we tested whether these frequency bands code objective values and hence showed amplitude modulation as a function of delay and/or reward. First, we grouped trial activity recorded at magnetometers with significant amplitude modulation according to the six different delays and averaged across all participants. This was possible because the six delays were the same in all participants. This leads to six new time series for each magnetometer, each representing the mean activity modulation for one of the six delays. At each time point, we linearized amplitude differences between these 6 time series by assigning a rank value to each of the 6 amplitude values (1 being the lowest amplitude and 6 representing the highest amplitude). Integer ranks can help to stabilize effects that might otherwise be obscured by fluctuations in the amplitude values. The ranks were used to identify temporal intervals; the statistical tests corroborating our hypothesis were conducted using the real data. Next, we tested whether these ranks, as a proxy for the amplitude values, correspond with the delays. A rank order matching the order of the delays (1 [shortest delay] to 6 [longest delay]) would indicate that a given frequency band responds with a gradually increasing amplitude modulation to a gradually increasing delay. To test this, we applied a linear least squares fit to the rank values at each time point. This results in a new slope time series with the largest positive or negative slopes when amplitude modulation varies with delay and slopes at ∼0 when amplitude is not modulated by the delay. The level of significance was corrected for multiple comparisons by comparing each slope value against a surrogate distribution. This surrogate distribution was constructed by randomly reassigning the labels (delay 1-6) to the six time series in 1000 permutations for each channel. This leads to 1000 surrogate slope values.
Frequency bands exceeding the two-sided 95th percentile of the surrogate distribution were classified as showing a significant correlation of amplitude and delay. To assess differences between frequency bands, we used a time point-by-time point one-way ANOVA to test for differences of slope values across magnetometers. We determined the empirical significance threshold for F values by randomly reassigning the frequency band labels in 1000 permutations of the same time point-by-time point ANOVA. Last, in the temporal interval of significant amplitude-by-delay covariation, we used a one-way ANOVA to test for differences of amplitude values between 6 delays across participants. The same analysis was conducted in the OTHER condition (see Fig. 5C).
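The rank-and-slope step can be illustrated with the sketch below. It is a minimal example under stated assumptions: delays are coded by their order (1 to 6) rather than by their duration in weeks, ties between amplitudes are broken by position rather than averaged, and the amplitude values are toy numbers.

```python
def ranks(values):
    """Rank values from 1 (lowest) to n (highest); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def least_squares_slope(y):
    """Slope of a least squares line through y against x = 1, 2, ..., n."""
    n = len(y)
    x = range(1, n + 1)
    mean_x = (n + 1) / 2
    mean_y = sum(y) / n
    numerator = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    denominator = sum((xi - mean_x) ** 2 for xi in x)
    return numerator / denominator


def rank_slope_series(amp_by_delay):
    """Slope of amplitude ranks across delays at every time point.

    amp_by_delay[d][t] is the mean amplitude for delay d (shortest first) at sample t.
    """
    n_samples = len(amp_by_delay[0])
    return [
        least_squares_slope(ranks([series[t] for series in amp_by_delay]))
        for t in range(n_samples)
    ]


# Toy data: six delays, two time points (amplitude rises with delay, then falls).
amp_by_delay = [[0.1, 0.9], [0.2, 0.8], [0.4, 0.6], [0.5, 0.5], [0.7, 0.3], [0.9, 0.2]]
slopes = rank_slope_series(amp_by_delay)
```

A slope of +1 means that the amplitude ranks follow the delay order exactly, −1 means the reverse order, and values near 0 indicate no systematic relationship between amplitude and delay.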
Correlation with reward. To test whether amplitude modulation varied as a function of the reward, we correlated the trial-to-trial variation of reward with the trial-to-trial variation of amplitude averaged across MEG sensors showing significant amplitude modulation at each time point. This analysis was done in each participant since monetary reward values depended on individual decisions and consequently differed across participants. This results in a Pearson's r time series for each participant. The level of significance was corrected for multiple comparisons by comparing each r value against a surrogate distribution. This surrogate distribution was constructed by randomly reassigning the amplitude values of one participant to the discount values of another participant in 1000 permutations. This leads to 1000 surrogate r value time series. Significance criterion was an r value with p < 0.025 within the surrogate distribution of all r values (see Fig. 5D).
Correlation of MEG response and discount differences. Finally, we assessed whether differences in amplitude modulation could explain differences of discounting behavior between conditions. We averaged in frequency bands showing significant amplitude modulation across MEG sensors representing objective choice options in both conditions and calculated the difference time series (Δamplitude) for each participant. At each time point, we calculated Pearson's r between Δamplitude and Δk denoting the difference in discount parameter in each participant. The level of significance was corrected for multiple comparisons by comparing each r value against a surrogate distribution. This surrogate distribution was constructed by randomly reassigning the amplitude values of one participant to the discount values of another participant in 1000 permutations. This leads to 1000 surrogate r value time series. Significance criterion was an r value with p < 0.025 within the surrogate distribution of all r values (see Fig. 5E).
Results
In all experiments, participants were asked to choose between an LL amount of 10€ at a variable delay D (1, 2, 5, 11, 24, or 52 weeks; presented in pseudorandom order) and an SS (now) reward (for a detailed description of the paradigm, see Materials and Methods; Fig. 1). Participants made 10 choices for each delay in pseudorandom order, while SS was adjusted according to the previous response to reach equivalent LL and SS options (see Materials and Methods; Table 1).
Figure 1. Depiction of choice options presentation. A, Reaction time experiment. B, Eye tracking. C, MEG experiment. In the first experiment, we tested for differences in decision times; hence, participants were asked to respond as fast as possible (A). B, C, Both in the eye tracking and MEG experiment, participants were presented with the choice options and instructed to withhold their decision until the fixation point turned red. B, During eye movement recording, choice options were presented further apart.
Participants characteristics and exclusion criteria
Decision time experiment
In the first experiment, we tested 22 participants (13 female; mean age 24.1 years; SD = 5.16; all right-handed with normal or corrected-to-normal vision) (1) for differences in decision time and (2) for whether they discounted differently between the two conditions (analysis steps are explained in more detail in Materials and Methods). We calculated a choice index (CI), which parameterizes how evenly participants chose both options. Six participants strongly preferred one choice option (CI ≥ 0.66, which means that one option was chosen 5 times as often as the other one; see Materials and Methods) and were excluded. The remaining participants did not differ on average in their CI (CISELF = 0.21, SD = 0.15; CIOTHER = 0.26, SD = 0.14; t(15) = 1.5; p = 0.15; Fig. 2A) or in their decision time (RTSELF = 2.36 s, RTOTHER = 2.34 s; t(15) = 0.16; p = 0.8; Fig. 3A). However, they discounted more strongly in the OTHER condition (t(15) = 2.3; p = 0.03), as indicated by higher discount parameters (meanOTHER = −3.7; SD = 0.23; meanSELF = −4.3; SD = 0.29; Fig. 3B).
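The correspondence between the CI cutoff and a 5:1 choice ratio can be checked with a short calculation. The formula below (absolute difference of choice counts divided by the total number of choices) is an assumed form of the CI that reproduces this correspondence; the exact definition is given in Materials and Methods, and the function name `choice_index` is hypothetical.

```python
def choice_index(n_ll, n_ss):
    """Assumed CI: 0 for perfectly balanced choices, 1 for one-sided choices."""
    return abs(n_ll - n_ss) / (n_ll + n_ss)


# 60 choices in total (10 choices for each of 6 delays)
balanced = choice_index(30, 30)       # both options chosen equally often
five_to_one = choice_index(50, 10)    # one option chosen 5 times as often
print(balanced, round(five_to_one, 3))
```

Under this assumed definition, a 5:1 ratio gives CI = 40/60, which lies just above the cutoff of 0.66.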
CI distribution. A–C, Participants chose LL and SS options equally often and did not differ between conditions. Error bars indicate SEM. D, Distribution of choice indices over all experiments with our cutoff at CI ≥ 0.66.
Eye tracking experiment
To further assess OTHER condition discounting effects, we compared the discounting parameters ln(k) (the natural logarithm of the k parameter adjusted throughout the experiment) in a second group (N = 20, 10 female; mean age 22.58 years; SD = 2.06). Eye tracking data (time series of vertical eye movements between the reward presented above and the delay presented below the fixation cross) were used to characterize the temporal dynamics of the evaluation of delay and monetary reward. Analysis steps are explained in more detail in Materials and Methods. In general, we tested for differences in discounting behavior (see Discounting differences), whether participants showed on average a consistent chronology of delay and reward evaluation (see Intervals of option evaluation), and whether participants evaluated choice options differently in the two conditions (see Differences in option evaluation). These analyses test for the stability of choice option evaluation (gaze entropy, which reflects the predictability or stability of gaze direction). In the final step, we tested whether discounting differences can be explained by differences in choice evaluation (see Correlation of gaze entropy and decision).
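To illustrate how an entropy measure can capture gaze stability, the sketch below computes the Shannon entropy of two hypothetical gaze distributions over the three screen regions (reward, fixation, delay). Shannon entropy is an assumption here; the gaze entropy measure actually used is defined in Materials and Methods and may differ.

```python
import math


def shannon_entropy(probabilities):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Hypothetical gaze probabilities for (reward, fixation, delay)
stable_gaze = [0.05, 0.05, 0.90]       # gaze rests mostly on the delay
scattered_gaze = [1 / 3, 1 / 3, 1 / 3]  # gaze is equally likely anywhere

entropy_stable = shannon_entropy(stable_gaze)
entropy_scattered = shannon_entropy(scattered_gaze)
print(round(entropy_stable, 3), round(entropy_scattered, 3))
```

Lower entropy corresponds to a more predictable, stable gaze direction; the uniform distribution gives the maximum of log2(3) bits for three regions.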
Discounting differences
One participant was excluded because of a strong bias toward one option (CI ≥ 0.66). The remaining participants showed on average no difference in CI (CISELF = 0.24, SD = 0.17; CIOTHER = 0.15, SD = 0.11; t(18) = 1.7; p = 0.11; Fig. 2B). Replicating Experiment 1, participants discounted more strongly in the OTHER condition (t(18) = 3.5; p = 0.0025; Fig. 4A), as indicated by higher discount parameters (meanOTHER = −3.9; SD = 0.84; meanSELF = −4.8; SD = 0.62; Fig. 4A).
Intervals of option evaluation
We tested the chronological order of the inspection of delay and reward. Participants tended to inspect delay first (Fig. 4B, I; significant difference to reward between 182 ms and 816 ms; tcrit = ± 2.05; tmax = 12.35 at 536 ms; p < 0.000001). Participants then inspected the reward (Fig. 4B, II) between 998 and 1201 ms indicated by higher probability for gaze at reward compared with delay (tmax = 2.45 at 1102 ms; p = 0.01). Third, they returned to the delay (Fig. 4B, III) between 1447 and 2110 ms (tmax = 3.68 at 1628 ms; p = 0.002) with higher probability for gaze at delay compared with reward.
Differences in option evaluation
Here we tested for temporal differences of inspection as a function of our experimental conditions. We found that participants more closely inspected the delay (I) in the SELF condition (Fig. 4C, bottom) in the temporal interval from 367 to 626 ms (tcrit = ± 2.075; tmax = 2.8 at 416 ms; p = 0.0038). Second, we found that participants more closely inspected the reward (II) in the SELF condition (Fig. 4C, top) in the temporal interval from 686 to 1032 ms (tmax = 2.95 at 925 ms; p = 0.0026). Third, when participants returned to the delay information, they also more closely inspected the delay (III) in the SELF condition between 1434 and 1634 ms (tmax = 2.36 at 1686 ms; p = 0.012).
Correlation of gaze entropy and decision
In the last step, we tested whether differences in gaze entropy (how stable participants looked at choice options) between conditions correlate with differences in discount parameter between conditions. We found a significant correlation (critical r value was ± 0.41) between differences in discount parameter and differences in gaze entropy between 628 and 778 ms (rmax = 0.66 at 722 ms, β = 89%; p = 0.0025; Fig. 4D).
MEG experiment
In a third group of participants (N = 24; 7 female; mean age: 26.17 years; SD = 5.16 years), we assessed discounting behavior (see Discounting differences) using MEG activity across specific frequency bands (see Choice options-related amplitude modulation). We then tested whether these specific frequency bands represent the delay (see Amplitude modulation with delay), and/or the reward (see Correlation with reward) and whether differences of MEG activity between conditions correlate with differences in discounting behavior (see Correlation of MEG response and discount differences). Analysis steps are explained in more detail in Materials and Methods.
Discounting differences
Five participants were excluded because >30% of their trials had to be rejected because of artifacts (we recorded only 60 trials in each condition). None of the remaining participants had to be excluded because of a strong bias toward one option (CI ≥ 0.66). The remaining participants did not differ with respect to CI (t(18) = 1.34, p = 0.2) but showed differences in discounting (t(18) = 2.85, p = 0.01), consistent with the other two groups (Fig. 5A).
Choice options-related amplitude modulation
We found a low-frequency band (LF: 6-35 Hz) and high-frequency band (HF: 150-275 Hz) with significant amplitude decrease compared with baseline bilaterally located over occipital cortex and over right frontotemporal cortex, respectively (Fig. 5B). Additionally, we found γ frequency (50-70 Hz) increase over baseline following stimulus onset in a central and occipital ROI (Fig. 5B).
Amplitude modulation with delay
Only HF amplitude varied with the delay information (critical slope value: 0.69) between 138 and 643 ms (slopemax = 0.78 at 411 ms; p = 0.03; Fig. 5C). In the SELF condition, we found differences in amplitude modulation depending on the delay presented in this interval (F(5,108) = 2.4; p = 0.039). Post hoc tests revealed a significant difference between 1 and 11 weeks (p = 0.04) and 1 and 52 weeks (p = 0.0003), 2 and 52 weeks (p = 0.01), and 5 and 52 weeks (p = 0.03). The OTHER condition exhibits no significant differences (F(5,108) = 0.5). In addition, we found a highly significant interaction between the OTHER and SELF condition (F(5,216) = 9.06; p < 0.00001).
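The reported degrees of freedom are compatible with a one-way layout of 6 delays × 19 remaining participants (114 observations, giving 5 and 108 degrees of freedom) and, for the interaction, with 2 conditions × 6 delays × 19 participants (228 observations minus 12 cells, giving 216). This reading is an inference from the reported values, not a statement from Materials and Methods. The following sketch reproduces the degrees of freedom with a plain one-way ANOVA on synthetic amplitudes; all data values and the function name `one_way_anova` are hypothetical.

```python
import random


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of observation groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


rng = random.Random(0)
delays = [1, 2, 5, 11, 24, 52]   # weeks (from the paradigm)
n_participants = 19              # participants remaining in the MEG group
n_conditions = 2                 # SELF and OTHER

# Synthetic amplitudes with a weak dependence on delay rank (illustration only)
groups = [[rng.gauss(0.1 * rank, 1.0) for _ in range(n_participants)]
          for rank in range(len(delays))]

f_value, df_between, df_within = one_way_anova(groups)
n_cells = n_conditions * len(delays)
df_interaction_error = n_cells * n_participants - n_cells
print(df_between, df_within, df_interaction_error)
```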
Correlation with reward
The HF amplitude was correlated with reward between 576 and 876 ms (rmax = 0.05; p = 0.0009 at 816 ms; Fig. 5D), which is corroborated by a significant difference between frequency bands between 741 and 826 ms (Fcrit = 3.6; Fmax = 5.8 at 756 ms; p < 0.00001). Importantly, in the OTHER condition, HF amplitude was not correlated with the monetary reward. Next, we directly compared the correlation values between the two conditions at each time point. Figure 5D shows the resulting t value time series. We found that correlation values differed between conditions (between 561 and 861 ms; tmax = 4.3 at 741 ms; p < 0.000001). This analysis revealed that the reward correlation in the SELF condition is not only significant but also larger than in the OTHER condition.
Correlation of MEG response and decision
We found that differences in high-frequency activity (HFA) predicted differences in discounting behavior between 96 and 310 ms (rmin = −0.5 at 286 ms; p = 0.01, β = 0.64; Fig. 5E). Although significant, the power of this analysis is relatively low compared with the correlation analysis in the eye tracking paradigm and the follow-up analysis (see below). Moreover, the individual differences analysis was conducted in a time-resolved manner. Hence, there is considerable variation over time, and we can reliably define a correlation only for a specific temporal interval. We hypothesize that future studies, designed specifically to test for an intermediate correlation strength using more participants, might be able to delineate the temporal evolution of individual differences between MEG responses and decisions with higher accuracy.
Follow-up analysis
We tested whether discounting behavior in the SELF condition was correlated with the OTHER condition across the three different groups. Differences in discounting parameters accompanied by a significant correlation indicate a similar baseline mechanism for decision-making, which utilizes a different level of objective information. We found, across all experiments, ln (k) parameters correlated between both conditions (r = 0.48; p = 0.0003, β > 0.95; Fig. 6).
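As a plausibility check, the pooled sample size follows from the exclusions reported for the three experiments (22 − 6, 20 − 1, and 24 − 5 participants), and the reported r can be converted into a t statistic with the standard formula t = r·sqrt(n − 2)/sqrt(1 − r²). The sketch assumes that all remaining participants entered the correlation; variable names are illustrative.

```python
import math

# Remaining participants per experiment (tested minus excluded, from the Results)
n_per_experiment = {"RT": 22 - 6, "ET": 20 - 1, "MEG": 24 - 5}
n_total = sum(n_per_experiment.values())

r = 0.48                      # reported correlation of ln(k) between conditions
df = n_total - 2              # degrees of freedom of the correlation test
t_value = r * math.sqrt(df) / math.sqrt(1.0 - r ** 2)
print(n_total, df, round(t_value, 2))
```

With 54 participants, the t statistic is close to 4 on 52 degrees of freedom, which is compatible with a p value well below 0.001.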
Reaction time experiment. A, Participants did not show differences in decision times between conditions that were made within 3 s. B, Discount curves for both conditions. The discount parameters (small inset) were significantly different with stronger discounting in the prosocial condition. Error bars indicate SEM.
Eye tracking results. A, Stronger discounting in the OTHER condition as indicated by differences in discount parameters (small inset). Error bars indicate SEM. B, Color-coded gaze probability as a function of time to choice option onset (x axis) averaged across all participants and conditions. y axis indicates the location of reward, fixation, and delay presented on the screen. Yellow band before stimulus onset (magenta vertical line) indicates that participants looked at the fixation cross. Following stimulus onset, participants directed gaze to delay and reward as indicated by light green and light yellow streaks across time. Right, Average across time. Peaks (dashed lines) correspond to exact locations on the screen of reward, fixation, and delay. Bell-shaped averaged probability values indicate that participants inspected reward and delay with spatial variability. Probability values within inflection points (solid lines) were averaged, leading to time series representing gaze probability to delay, reward, and fixation. Bottom, A consistent chronology of first delay and then reward inspection. Black and red lines indicate probability to direct gaze to delay and reward, respectively, averaged across participants. First gray shaded area represents the interval in which participants gazed to delay more than to reward as indicated by a significant t value; the second gray shaded area, vice versa. C, Time-varying probability to direct gaze to fixation in both conditions averaged across participants. Gray shaded area represents the temporal interval, with significant difference between SELF (blue) and OTHER (green) condition indicating that participants disengaged slower from fixation in the OTHER condition. Bottom, Gaze direction to the delay (black frame) separately for both conditions. Top, Gaze direction to reward (red frame) separately for both conditions. 
Gray shaded areas represent intervals of significant differences between conditions, indicating that both delay and reward were inspected more closely in the SELF compared with the OTHER condition. D, Correlation of differences of gaze entropy and difference of discount parameter between conditions at the time point shown in the left small diagram. Greater differences in gaze entropy are correlated with greater differences in discounting.
MEG results. A, Behavioral results. B, Amplitude modulation in three different frequency bands with topographical distributions. Two gray dashed lines indicate the upper and lower confidence limits (obtained from a permutation test) and hence distinguish the frequency bands with significant amplitude decrease (below the lower gray line) from the frequency band with significant amplitude increase (above the upper gray line). In each frequency band, we determined MEG sensors with significant amplitude modulation over baseline based on a permutation procedure. C, Only HFA shows correlation with delay. Differences in amplitude modulation as a function of delay were found only in the SELF but not in the OTHER condition. Error bars indicate SEM. D, In a temporal interval following the coding of the delay, we found significant correlation with reward information only in the high-frequency band and exclusively in the SELF condition. E, Differences in amplitude modulation between conditions were correlated with differences in discounting between conditions.
Follow-up analysis. Correlation of discount parameters between conditions across all three experiments: reaction time (RT), eye tracking (ET), and MEG.
Discussion
Studies on DD have focused on the representation of subjective value signals by contrasting differential activation associated with smaller and sooner versus larger and longer choices (McClure et al., 2004; He et al., 2012; Kim et al., 2012; Cooper et al., 2013; Peper et al., 2013; van den Bos and McClure, 2013; van den Bos et al., 2014, 2015). Impulsivity, the choice of SS, is associated with preferring sooner rewards compared with later rewards or to insufficiently considering objective alternatives (Ainslie, 1975; Myerson et al., 2003; Olson et al., 2007).
We tested whether subjective value (by contrasting self-referential decisions with prosocial decisions) counteracts impulsivity. There is considerable debate whether humans are truly prosocial. In intertemporal choices, it is unclear whether a higher reward or an earlier gratification is the prosocial choice. Previous studies showed that participants put in more effort when acts benefited themselves (Lockwood et al., 2017). In our experimental settings across different groups, participants made more impulsive decisions in the prosocial condition (they discounted less in self-referential than in prosocial decisions) but did not show the decision time differences between conditions that are seen in impulsive actions (Cho et al., 2010; Wang et al., 2016; Yates et al., 2016). Hence, we hypothesized that differences in discounting result from differences in the depth of evaluation of choice options. Prosocial decisions therefore provide the opportunity to test how choice option integration is represented in human participants and whether information that pertains to ourselves is processed differently compared with prosocial decisions.
In order to elucidate the temporal evolution of the option integration process itself, we compared patterns of attentional reallocation (eye tracking) and choice option integration (MEG). In our eye tracking experiment, the increased spatial distance between choice options on the screen increased the effort required to move the eyes, corroborating predictions of less effort in the prosocial condition (social apathy) (Lockwood et al., 2017). Patterns of choice option evaluation were similar between conditions, which explains why decision times did not differ. Gaze stability, however, differed between conditions and explained interindividual differences in discounting behavior. We hypothesize that less attentional orientation limits the representation and integration of choice options. MEG was recorded in a separate experiment with minimal spatial extent of choice option presentation on the screen, which minimized eye movement contamination of the temporal evolution of choice option integration and allowed replication of the behavioral results in two independent groups. To compare eye tracking with MEG results, we constructed time series representing the probability to look at a given spatial point as a function of time. We observed that intervals of attentional shift to the delay and reward in the eye tracking experiment were paralleled by intervals of representation of delay and reward in the HFA in MEG.
Intertemporal choices in DD experiments refer to the trade-off between benefits and costs, where the cost can be either an increasing delay (“wait”) or increasing effort (“work”) to obtain the reward (Phung et al., 2019). Both can be dissociated by the underlying neuronal circuits driving behavior toward reward maximization and effort minimization (Prévost et al., 2010; Massar et al., 2015; Klein-Flügge et al., 2016), accompanied by differences in discount curves, with an inverted sigmoid function in effort discounting and a hyperbolic function in DD. We did not see an initial concave shape in the prosocial condition, which is typical for effort discounting, arguing against the notion that the two conditions recruit different neuronal networks. Instead, we propose that differences in discounting result from differences in choice option evaluation. This provides evidence that gaze stability is a proxy for attentional direction to choice options and predicts DD.
The MEG study showed that only high-frequency activity (HFA; 175-250 Hz) was modulated by choice options exclusively in the self-referential condition, providing evidence that activity distributed in MEG sensors over frontotemporal regions reflects integration of delay and the reward in humans. HFA is assumed to reflect nonrhythmic synaptic activity (Buzsáki et al., 2012) and is a key marker of cortical activation (Edwards et al., 2005; Ray et al., 2008). Intracranial recordings of HFA response dynamics in humans have enhanced our understanding of cortical information integration in attention, language, memory, emotion, decision-making, and motor control (Johnson et al., 2020). These studies imply that HFA acts as an index of local cortical computation (Buzsáki et al., 2012; Rich and Wallis, 2017). HFA bridges a long-standing gap to fMRI studies on DD. Power modulation in higher frequencies has been shown to explain BOLD responses better than activity in lower frequencies, which are instead thought to reflect activity in broadly distributed networks (Nir et al., 2007; Mukamel et al., 2014). The temporal precision of MEG adds to the spatial resolution of fMRI and is able to delineate mechanisms of choice option integration in time, which is in line with previous studies on humans and nonhuman primates showing that HFA captured reward-related information (Hunt et al., 2015). HFA has been regarded as a good measure of neuronal spiking (Liu and Newsome, 2006; Berens et al., 2008), consistent with the idea that HFA reflects aggregate local neuronal output (Buzsáki et al., 2012) because of high correlations between HFA and multiunit activity. Both can distinctively be localized in granular/infragranular and supragranular layers, respectively, in V1 and A1 in monkeys and PFC in humans. 
Supragranular HFA contributes significantly more to the surface field potential than deeper layers, and it is argued that HFA may contain a substantial representation of input from cortical feedback pathways (Leszczyński et al., 2020). Recent single-neuron studies in monkeys provide insight into the neural mechanism for the estimation of interval time (Brody et al., 2003). Single-unit activity is modulated by the amount of an expected reward (Leon and Shadlen, 1999; Wallis and Miller, 2003) and encodes the relative reward value (Tremblay and Schultz, 1999; Cai et al., 2011). Furthermore, single-neuron recordings in pigeons showed that neural delay activity was modulated by increasing delay length and additionally covaried with expected reward amount (Kalenscher et al., 2005).
The right dlPFC, associated with executive and control functions, potentially represents choice options (Bickel et al., 2009; Achterberg et al., 2016) and response selection among the most advantageous options (Ho et al., 2016). Our results indicate that self-referential decisions are characterized by a suppression of response selection that mitigates impulsive decisions. Activity in the medial and right dlPFC is also positively correlated with self-risk (Hu et al., 2017), and cathodal transcranial direct current stimulation (tDCS), which is associated with inhibition, reduces impulsivity and risky behavior in patients with Parkinson's disease (Benussi et al., 2017). Thus, lower levels of activity are associated with less impulsivity, which is in line with a relative HFA reduction during SELF compared with OTHER decisions in our study. Importantly, we did not evaluate the activity level of impulsive versus patient decisions but the neural activity accompanying choice option presentation preceding these decisions. The disruption of the right dlPFC with low-frequency repetitive transcranial magnetic stimulation reduces emotional weight during decision-making in social contexts (Tassy et al., 2012), indicating that the dlPFC actually integrates objective choice options. These findings are controversial since the impact of tDCS on the right dlPFC is complex (low risk aversion in gain frames after tDCS but high risk aversion in loss frames after stimulation) (Ye et al., 2016). Other studies showed that the right dlPFC mediates action value comparisons in value-based decision-making (Morris et al., 2014) and plays a causal role in the computation of values of choices (Camus et al., 2009).
Mapping of sources to sensors is ill-posed in noninvasive recordings. Hence, a multitude of different source combinations could generate the observed field pattern. Only direct intracranial recordings can reliably distinguish anatomic localizations. Furthermore, we operationalized prosocial acts by having participants take the perspective of their best friend. It could be argued that participants did not perform genuine prosocial acts since, because of the hypothetical outcomes, they did not directly benefit others. How real rewards, directly paid to others, influence prosocial decisions in intertemporal choices remains to be determined. Moreover, future studies can test for an intermediate correlation strength using more participants to delineate with higher accuracy the temporal evolution of the correlation between MEG responses and decisions.
Research on intertemporal behavior has emphasized trait-like variance (Luo et al., 2014). We found a correlation between the discount parameters in the SELF and OTHER conditions, arguing in favor of an individual disposition to discount delayed values. We argue that impulsivity does not result from oversensitivity to one option but from a lack of attentional allocation to choice options. The HFA measured over the right frontotemporal cortex shows broadband amplitude modulation only when decisions have a high value for the self but not during anonymous prosocial decisions. Intervals of delay and reward representation match intervals of gaze toward delay and reward. In sum, our results highlight a unique role of high-frequency band activity recorded over the right frontotemporal cortex in representing objective values important for suppressing impulsivity.
Footnotes
This work was supported by National Institute of Neurological Disorders and Stroke Grant R37NS21135, DFG/SFB 779 TP A02, Autonomie im Alter (EFRE).
The authors declare no competing financial interests.
Correspondence should be addressed to Stefan Dürschmid at sduersch{at}lin-magdeburg.de