Abstract
Neurophysiological work in primates and rodents has shown that the amygdala plays a central role in reward processing through connectivity with the orbitofrontal cortex (OFC) and hippocampus. However, understanding the role of oscillations in each region and their connectivity in different stages of reward processing in humans has been hampered by the limitations of noninvasive methods, such as poor spatial and temporal resolution. To overcome these limitations, we recorded local field potentials (LFPs) directly from the amygdala, OFC and hippocampus simultaneously in human male and female epilepsy patients performing a monetary incentive delay (MID) task. This allowed us to dissociate electrophysiological activity and connectivity patterns related to the anticipation and receipt of rewards and losses in real time. Anticipation of reward increased high-frequency gamma (HFG; 60–250 Hz) activity in the hippocampus and theta band (4–8 Hz) synchronization between amygdala and OFC, suggesting roles in memory and motivation. During receipt, HFG in the amygdala was involved in outcome value coding, in the OFC in cue context-specific outcome value comparison, and in the hippocampus in reward coding. Receipt of loss decreased amygdala-hippocampus theta and increased amygdala-OFC HFG amplitude coupling, which coincided with subsequent adjustments in behavior. Increased HFG synchronization between the amygdala and hippocampus during reward receipt suggested encoding of reward information into memory for reinstatement during anticipation. These findings extend what is known about the primate brain to humans, showing key spectrotemporal coding and communication dynamics for reward and punishment related processes, which could serve as more precise targets for neuromodulation to establish causality and potential therapeutic applications.
SIGNIFICANCE STATEMENT Dysfunctional reward processing contributes to many psychiatric disorders. Neurophysiological work in primates has shown the amygdala, orbitofrontal cortex (OFC), and hippocampus play a synergistic role in reward processing. However, because of limitations with noninvasive imaging, it is unclear whether the same interactions occur in humans and what oscillatory mechanisms underpin them. We addressed this issue by recording local field potentials (LFPs) from all three regions in human epilepsy patients during monetary reward processing. There was increased amygdala-OFC high-frequency coupling when losing money which coincided with subsequent adjustments in behavior. In contrast, increased amygdala-hippocampus high-frequency phase-locking suggested a role in reward memory. The findings highlight amygdala networks for reward and punishment processes that could act as more precise neuromodulation targets to treat psychiatric disorders.
Introduction
Reward and punishment exert a powerful influence on a vast array of cognitive, behavioral, and emotional processes necessary to survive and thrive in a volatile world. Detailed information about the neural basis of reward processing has been garnered from primates and rodents using focal lesions and electrophysiological recordings with exceptional spatial and temporal precision. These studies have shown amygdala and orbitofrontal cortex (OFC) neurons encode the value of rewarding and aversive conditioned and unconditioned stimuli (Padoa-Schioppa and Assad, 2006; Paton et al., 2006; Belova et al., 2008; Morrison and Salzman, 2009; Bermudez and Schultz, 2010; Jezzini and Padoa-Schioppa, 2020). Reward guided behavior depends on an interaction between the two regions and is disrupted by lesions of either region or the connecting white matter tracts (Baxter et al., 2000; Hampton et al., 2007; Morrison et al., 2011; Rudebeck et al., 2013, 2017b; Chau et al., 2015; Fiuzat et al., 2017; Murray and Fellows, 2022). The OFC is well placed to compare and choose between stimuli or actions by combining value from the amygdala with crucial mnemonic inputs about stimuli, context, and task from the hippocampus (Rudebeck and Murray, 2014; Bocchio et al., 2017; Knudsen and Wallis, 2020).
Unfortunately, we know little about how these findings apply to humans because of limitations with noninvasive methods. Human lesions are rarely focal, fMRI does not directly measure neural activity, can represent excitatory or inhibitory activity, has limited temporal resolution, and can suffer from signal dropout in OFC. MEG/EEG has good temporal resolution, but limited spatial resolution and EEG is insensitive to high-frequency activity because of distortion by tissues between the brain and electrodes. Developments in implantation and localization methods have led to a rise in the use of recordings from intracranial electrodes implanted to determine regions for resection to alleviate epilepsy (Parvizi and Kastner, 2018). Intracranial recordings can fill an important niche in neuroscientific studies of reward as they have high spatial and temporal resolution, are a direct measure of neural activity and record from many of the regions important for reward processing without signal distortion either by the skull/meninges or air-filled chambers within the skull. Yet only a handful of studies have exploited these advantages for studying reward processing in humans with the majority looking at each region piecemeal (Vanni-Mercier et al., 2009; Ramayya et al., 2015; Li et al., 2016; Saez et al., 2018; Gueguen, et al., 2021) rather than in tandem (Jenison, 2014; Zheng et al., 2017, 2019; Lopez-Persem et al., 2020). To date, intracranial data directly comparing human amygdala and OFC function and their connectivity in reward and motivation are limited (Jenison, 2014).
In this article, we report a novel human intracranial study of connectivity between core nodes of the reward system (amygdala, OFC, hippocampus) using a well-established neural and behavioral test of human motivation adopted from imaging to dissociate between the anticipation and receipt of rewards and losses [the monetary incentive delay (MID) task; Knutson and Greer, 2008]. This not only allowed us to assess the division of labor between regions but also their functional interactions using connectivity metrics to delineate key mechanisms for the routing of information via timing, frequency, phase, and amplitude (Lachaux et al., 1999; Salinas and Sejnowski, 2001; Varela et al., 2001; Fries, 2005, 2015; Siegel et al., 2012; Siems and Siegel, 2020). Different frequency bands have distinct underlying generators and functions. Low frequencies are generated by postsynaptic potentials across a broader spatial scale and orchestrate localized high gamma activity [high-frequency gamma (HFG); 60–250 Hz], which is closely related to single unit firing typically recorded in primates and involved in cognitive computations (Canolty and Knight, 2010; Buzsáki et al., 2012; Lachaux et al., 2012). Synchronization between regions in the gamma band is particularly important for functional integration (Fries, 2005, 2015; Fries et al., 2007; Lachaux et al., 2012). We identify amygdala centered networks which code and communicate reward and loss information through these frequency channels.
Materials and Methods
Patients
The study took place in the neurosurgical service of Ruijin Hospital, Shanghai JiaoTong University. Sixteen patients took part in total. All had severe treatment-refractory epilepsy and were undergoing stereotactic-EEG (SEEG) monitoring to locate the seizure onset zone for resection. For this reason, all patients were not taking anti-convulsive medicines at the time of testing. Patients were mostly female (N = 12), had a mean age of 28.5 (SD = 10.5) and were all right-handed. The average Montreal cognitive assessment (MOCA) score across patients was 25 (SD = 2.6). A total of nine patients had hippocampus electrodes, nine patients had OFC electrodes and 16 patients had amygdala electrodes. All OFC electrodes were implanted in a dorsal-ventral trajectory whereas amygdala and hippocampus electrodes were implanted in the lateral-medial trajectory. Each electrode had 16 2-mm contacts each separated by 1.5 mm. Testing took place after subjects had completed their clinical assessments for seizure localization. The ethics committee of Ruijin hospital, Shanghai JiaoTong University School of Medicine approved all procedures used. All patients provided written informed consent in accordance with the Declaration of Helsinki.
MID task
In the MID task, patients saw one of three distinctive cues which signaled whether a reward or loss could be received depending on the speed and correctness of a simple visual discrimination response (Fig. 1A). This task allows us to separate the period between the cue and response, when the subject anticipates the outcome, from when the outcome is received. On reward cue trials, a correct response led to a monetary reward, depicted by a 10 yuan note, and an incorrect or no response led to no reward/a neutral gray box. On loss cue trials, an incorrect or no response led to monetary loss, depicted by a 10 yuan note with a red cross overlaid, and a correct response led to no monetary loss/a neutral gray box. On neutral cue trials no money was won or lost and a gray box was shown regardless of correctness. To ensure equal numbers of trials across conditions and that patients were incentivized to respond quickly, the allotted time within which patients were allowed to respond started at 800 ms and was increased or decreased adaptively by 50 ms depending on the previous response being incorrect or correct, with a maximum time of 1000 ms and no fixed minimum. This means that roughly half of trials in each condition will be correct and incorrect/no response. Throughout the article, we define incorrect trials as both erroneous arrow classifications and nonresponses unless otherwise stated. The 500-ms cue was followed by the 2000-ms anticipation phase during which the colored square remained on the screen without the icon as a reminder of the type of outcome. A white arrow then appeared. Patients were instructed to press one of two buttons on a response box with their left or right thumb as quickly as possible in the direction of the arrow. This visual discrimination ensured patients maintained alertness and were attentive throughout the task. After the response, there was a blank screen of at least 500 ms followed by the outcome which was presented for 2000 ms. 
The blank interval was intended to give any response related activity time to dissipate so that it did not confound outcome activity. The total duration from arrow onset to outcome onset was always 1500 ms. The intertrial interval/fixation was 1500–2000 ms. Patients were told that they needed to respond as quickly as possible to either win or avoid losing money. Patients completed 40 trials per condition after 18 practice trials. The task was programmed and run in MATLAB using Psychtoolbox 3.0 functions (Brainard, 1997). Stimuli were displayed on an LG L1954 monitor which has a width/height of 380 × 300 mm and a resolution of 1280 × 1024 pixels. The size of the cue/anticipation stimuli was 600 × 600 pixels. The arrows had a total length of 1000 pixels; the width/height of the shaft was 700 × 100 pixels. The words/money and gray outcomes were 730 × 600 pixels in size. The cumulative amount won/lost was presented in size 200 font below the money. Patients sat ∼75 cm away from the screen.
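The adaptive response window described above can be sketched as follows. This is a minimal illustration and not the authors' MATLAB/Psychtoolbox code; the function names are hypothetical, and the update rule assumes the window shrinks by 50 ms after a correct response and grows by 50 ms after an incorrect or missing response, capped at 1000 ms with no lower bound.

```python
def update_response_window(window_ms, was_correct, step_ms=50, max_ms=1000):
    """Return the response window (ms) for the next trial.

    A correct response shortens the window by step_ms; an incorrect or
    missing response lengthens it, capped at max_ms. No minimum is applied,
    mirroring the "no fixed minimum" in the task description.
    """
    if was_correct:
        return window_ms - step_ms
    return min(window_ms + step_ms, max_ms)


def simulate_windows(outcomes, start_ms=800):
    """Apply the staircase to a sequence of correct/incorrect outcomes."""
    windows = [start_ms]
    for was_correct in outcomes:
        windows.append(update_response_window(windows[-1], was_correct))
    return windows


# Two correct responses followed by one miss.
print(simulate_windows([True, True, False]))
```

Because the window tightens after every success and relaxes after every failure, accuracy is pushed toward roughly 50% in each condition, which is what keeps trial counts balanced between correct and incorrect outcomes.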
Reaction time analysis
Reaction times were normalized using a logarithmic transformation and z-scored to facilitate comparison across patients, and outliers more than 2.5 SD above or below each patient's mean were excluded from analyses. The reaction times of all correct response trials were analyzed with a linear mixed effects model implemented using the fitlme function in MATLAB with fixed effects factors of condition (reward, loss, neutral) and arrow direction and the random effects factor of subject. The accuracy rates were compared using χ2 tests. Critical p-value thresholds were derived using Bonferroni correction for the number of tests.
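A minimal sketch of the per-patient reaction time normalization is given below. It assumes the order log transform, then z-score, then exclusion of trials beyond 2.5 SD, without re-standardizing after exclusion; the text does not specify these details, and the function name is hypothetical.

```python
import numpy as np


def normalize_reaction_times(rt_ms, sd_cutoff=2.5):
    """Log-transform and z-score one patient's reaction times, then drop outliers.

    Returns the z-scores of the retained trials and a boolean mask marking
    which of the original trials were kept.
    """
    rt_ms = np.asarray(rt_ms, dtype=float)
    log_rt = np.log(rt_ms)
    # z-score within patient (sample SD, ddof=1, is an assumption)
    z = (log_rt - log_rt.mean()) / log_rt.std(ddof=1)
    keep = np.abs(z) <= sd_cutoff
    return z[keep], keep


rts = [400, 420, 380, 410, 390, 405, 395, 415, 385, 400,
       400, 420, 380, 410, 390, 405, 395, 415, 385, 3000]
z_kept, keep = normalize_reaction_times(rts)
print(keep.sum(), "of", len(rts), "trials retained")
```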
Electrode contact selection
The preimplant T1-weighted MRI and postimplant CT scans were transformed into MNI ICBM152 coordinates using affine co-registration (Ashburner and Friston, 2005) in Brainstorm (Tadel et al., 2011). The MNI coordinates of the tip and trajectory of each electrode shaft were used to locate the electrode contacts. For the amygdala and hippocampus, the reconstruction was overlaid on the subcortical ASEG atlas (Fischl et al., 2002) to verify which contacts were located within regions of interest. Additionally, the contact positions were assessed using the Harvard-Oxford cortical and subcortical atlas. Contacts located outside of regions of interest or in regions that were subsequently resected were excluded from analysis. All contacts used for analysis are shown in Figure 2A. For the amygdala and hippocampus, we used a maximum of the first three electrode contacts from the tip (two bipolar pairs per hemisphere; up to four signals per region) for analysis, because these contacts were most frequently and precisely located within these regions. As the OFC electrodes were implanted in a dorsal-ventral trajectory and because the cortex is around 3 mm thick, we found that only the first two electrode contacts were within the OFC. Therefore, there was only one bipolar pair for each OFC shaft. However, as the OFC is relatively large, the neurosurgeons sometimes implanted more than one electrode. In this case there was more than one electrode pair in one of the hemispheres.
Data preprocessing
SEEG data were recorded using a BrainAmp MR amplifier (Brain Products) with a 1000-Hz sample rate. In addition to the SEEG electrodes, we also recorded the electrooculogram (EOG) from electrodes placed above, below and beside the right eye. This allowed us to confirm that eye muscle activity from blinks and saccades did not contaminate the LFP data. The data were preprocessed and analyzed using MATLAB 2019b, FieldTrip (Oostenveld et al., 2011) and SPM (Litvak et al., 2011). Offline, the data were re-referenced using a bipolar montage by subtracting adjacent contacts, high-pass filtered at 1 Hz and notch filtered at 50 Hz and its harmonics using two-pass IIR Butterworth zero-phase lag filters to remove DC bias and powerline noise. The data were z-normalized to facilitate comparison across patients and visually inspected (blind to conditions) to remove epochs contaminated with artefactual activity. Across analyses, multiple contact pairs within regions of interest were averaged to increase the signal-to-noise ratio (SNR). Across all analyses, the number of trials was equalized across conditions and all trials with reaction times below 250 ms were excluded. After preprocessing, the average number of trials included in the analysis per condition was 19.5 (SD = 3.6) for the anticipation phase and 14.5 (SD = 2.2) for the outcome phase.
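The bipolar re-referencing and z-normalization steps can be illustrated with a short NumPy sketch. It is not the FieldTrip pipeline used in the study: the function names are hypothetical, the sign convention (contact k minus contact k+1, tip first) is an assumption, and filtering is omitted. The example shows why a signal common to all contacts, such as a shared reference, cancels in the bipolar derivation.

```python
import numpy as np


def bipolar_rereference(shaft_data):
    """Subtract each contact from its neighbour along one electrode shaft.

    shaft_data: array of shape (n_contacts, n_samples), ordered tip first.
    Returns an array of shape (n_contacts - 1, n_samples).
    """
    shaft_data = np.asarray(shaft_data, dtype=float)
    return shaft_data[:-1] - shaft_data[1:]


def zscore_channels(data):
    """Z-normalize each channel (row) over time."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=1, keepdims=True)
    std = data.std(axis=1, keepdims=True)
    return (data - mean) / std


rng = np.random.default_rng(0)
common = rng.standard_normal(1000)       # activity shared by all contacts
local = rng.standard_normal((3, 1000))   # activity specific to each contact
bipolar = bipolar_rereference(local + common)
# The shared component is removed; only local differences remain.
print(np.allclose(bipolar, local[:-1] - local[1:]))
```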
Analysis of local field potentials (LFPs)
We analyzed low-frequency and high-frequency activity separately as each reflects different neuronal mechanisms (Fig. 2B). As low-frequency activity comprises multiple frequency bands, we decomposed the signal into its constituent frequencies with time-frequency analysis. As HFG activity is a single broadband signal, we filtered the signal between 60 and 250 Hz and analyzed the envelope in the same manner as an event-related potential (ERP; Saez et al., 2018).
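As a rough illustration of the HFG envelope approach, the sketch below bandpass filters a signal between 60 and 250 Hz with a zero-phase Butterworth filter, takes the magnitude of the analytic (Hilbert) signal, and applies the 200-ms moving average described in the HFG section. It uses SciPy rather than the authors' MATLAB/FieldTrip code; the filter order is not given in the text and is an assumption, and the function name is hypothetical.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def hfg_envelope(signal, fs=1000.0, band=(60.0, 250.0), order=4, smooth_ms=200):
    """Return the smoothed 60-250 Hz amplitude envelope of a 1-D signal."""
    # Two-pass (forward and backward) filtering gives zero phase lag.
    sos = butter(order, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)
    # Instantaneous amplitude is the magnitude of the analytic signal.
    envelope = np.abs(hilbert(filtered))
    # Moving-average smoothing over smooth_ms milliseconds.
    width = int(round(smooth_ms * fs / 1000.0))
    kernel = np.ones(width) / width
    return np.convolve(envelope, kernel, mode="same")


fs = 1000.0
t = np.arange(0, 2, 1 / fs)
amplitude = np.where(t < 1, 1.0, 3.0)              # gamma amplitude triples at 1 s
sig = amplitude * np.sin(2 * np.pi * 100 * t) + 5 * np.sin(2 * np.pi * 10 * t)
env = hfg_envelope(sig, fs=fs)
print(round(env[500], 2), round(env[1500], 2))
```

The large 10-Hz component in the example is removed by the bandpass filter, so the envelope tracks only the amplitude of the 100-Hz component.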
Time-frequency decomposition, low-frequency oscillations
Time-frequency decomposition was performed using multitaper convolution. For each trial, the data were windowed using a sliding time-window centered at 20-ms increments and tapered to reduce spectral leakage before calculating power. We analyzed logarithmically spaced frequencies between 2 and 32 Hz at 25 scales per octave using a single Hanning tapered time-window with a duration of six cycles. The time-frequency representations were averaged across conditions and baseline corrected by calculating percent signal change from −500 to 0 ms before the onset of the cue during the fixation period [(active – baseline)/baseline × 100]. The time-frequency decompositions were statistically analyzed at the group level using SPM12 (Wellcome Department of Imaging Neuroscience, Institute of Neurology, London, United Kingdom) which affords greater sensitivity compared with nonparametric methods (Kiebel et al., 2005). To meet the requirement for SPM analysis, that the data approximate a multivariate Gaussian distribution, the time-frequency images were square-root transformed and smoothed with a Gaussian kernel which had a full-width half-maximum (FWHM) of 12.5 log frequency units and 300 ms (Kilner et al., 2005). The size of this smoothing kernel was chosen based on the matched filter theorem (i.e., that the size of the smoothing kernel should approximate the size of the expected effects). We used 12.5 log frequency units as this was approximately the same width as the canonical frequency bands and 300-ms time units as this is similar to that previously used (Huebl et al., 2014). The time-frequency matrices were converted into images and entered into a second-level flexible factorial design to compare with paired t tests. 
We used a cluster-forming threshold of p = 0.05 but only report clusters as significant if the random field theory (RFT) corrected cluster-level significance exceeded a Bonferroni–Holm correction for six contrasts (comparing reward and loss with neutral across three regions). Instead of isolating each low-frequency band a priori, i.e., delta (2–4 Hz), theta (4–8 Hz), alpha (8–12 Hz), and beta (12–30 Hz), our analysis was data driven, meaning we could identify the timing and frequency of neural activity patterns without prior assumptions while controlling for multiple comparisons. However, as activity patterns tend to cluster approximately to these frequency bands, we use this nomenclature to refer to specific effects.
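Two of the steps above lend themselves to a short sketch: building the logarithmically spaced frequency axis (2–32 Hz at 25 steps per octave) and applying the percent signal change baseline correction, (active – baseline)/baseline × 100. The function names are hypothetical, and treating the baseline window as −500 ms inclusive to 0 ms exclusive is an assumption.

```python
import numpy as np


def log_spaced_frequencies(f_min=2.0, f_max=32.0, per_octave=25):
    """Frequencies from f_min to f_max with per_octave steps in each octave."""
    n_steps = int(round(np.log2(f_max / f_min) * per_octave))
    return f_min * 2.0 ** (np.arange(n_steps + 1) / per_octave)


def percent_signal_change(power, times_ms, baseline=(-500, 0)):
    """Baseline-correct a time-frequency power map.

    power: array (n_freqs, n_times); times_ms: time of each column relative
    to cue onset. Each frequency is expressed as percent change from its own
    mean power in the baseline window.
    """
    power = np.asarray(power, dtype=float)
    times_ms = np.asarray(times_ms)
    in_baseline = (times_ms >= baseline[0]) & (times_ms < baseline[1])
    base = power[:, in_baseline].mean(axis=1, keepdims=True)
    return (power - base) / base * 100.0


freqs = log_spaced_frequencies()
times = np.arange(-500, 1000, 20)                   # 20-ms steps, as in the text
power = np.full((freqs.size, times.size), 2.0)
power[:, times >= 0] = 3.0                          # power rises from 2 to 3 after the cue
psc = percent_signal_change(power, times)
print(freqs.size, psc[0, -1])
```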
HFG amplitude-envelope analysis
To assess HFG modulation, we bandpass filtered the signal between 60 and 250 Hz using two-pass IIR zero phase lag Butterworth filters and extracted the instantaneous amplitude/envelope by taking the absolute value of the Hilbert transform. The data were smoothed with a 200-ms moving average sliding window. After trials were averaged within conditions, the data were baseline corrected by subtracting the mean activity between −500 and 0 ms before cue onset. We chose the 60- to 250-Hz range as this encompasses the full range of frequencies that have variously been referred to as HFG in the literature while avoiding 50-Hz line noise. HFG activity was analyzed at each time point using two-sided paired-samples t tests with a threshold of p = 0.05 and corrected for multiple comparisons across time and channels using nonparametric cluster-based permutation tests with 10,000 permutations (Maris and Oostenveld, 2007). The permutation test works by repeatedly permuting condition labels, performing a t test and extracting the statistic of the largest cluster thousands of times to build a null distribution against which the cluster statistics from the nonpermuted t test can be evaluated. To correct for multiple channels, the maximum cluster in each permutation iteration is determined across all channels. There were 16 patients with amygdala electrodes and nine patients with OFC or hippocampus electrodes. All patients with OFC or hippocampus electrodes had corresponding amygdala electrodes. Therefore, to correct for the number of channels in the analysis of OFC and hippocampus, we equalized the number of amygdala patients to nine. To avoid selection bias, we repeated the tests twice; once using the amygdala electrodes from the patients that also had OFC electrodes for the assessment of OFC significance, and once using the amygdala electrodes from the patients that also had hippocampal electrodes for assessment of hippocampal significance. 
We only report effects as significant if they exceeded a Bonferroni–Holm correction for the number of contrasts (which was four). For the analysis of the amygdala, we analyzed all 16 patients separately from the OFC and hippocampus because seven patients had no corresponding OFC or hippocampal channels to control for. Therefore, for the amygdala, we used a more stringent Bonferroni–Holm correction including the number of channels and contrasts (which was 12 in total). We limited our analysis to between 0 and 1 s after stimulus onset based a priori on the nonhuman primate literature, which has shown that single-cell action potentials and LFPs recorded from the amygdala and OFC in response to reward/punishment peak within this time window (Padoa-Schioppa and Assad, 2006; Paton et al., 2006; Belova et al., 2008; Morrison and Salzman, 2009; Bermudez and Schultz, 2010; Morrison et al., 2011; Jezzini and Padoa-Schioppa, 2020). As HFG activity is believed to reflect population level multiunit activity, we expected it would show similar modulation to rewards and punishments as shown in these nonhuman primate studies (Buzsáki et al., 2012; Lachaux et al., 2012).
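The logic of the cluster-based permutation test can be sketched for a single channel as follows. This is a simplified illustration, not FieldTrip's implementation: for a paired design, permuting condition labels within a patient is equivalent to flipping the sign of that patient's difference wave; the cluster statistic is assumed to be the summed t value (cluster mass); the cluster-forming t threshold is passed in explicitly (for example, 2.306 for a two-sided p = 0.05 with nine patients, df = 8); and the maximum across channels used in the study is omitted. All function names are hypothetical.

```python
import numpy as np


def paired_t(diff):
    """Paired t statistic at each time point; diff has shape (n_subjects, n_times)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))


def max_cluster_mass(t_vals, t_threshold):
    """Largest summed |t| over contiguous same-sign supra-threshold time points."""
    best = 0.0
    for sign in (1, -1):              # positive clusters, then negative clusters
        run = 0.0
        for t in sign * t_vals:
            run = run + t if t > t_threshold else 0.0
            best = max(best, run)
    return best


def cluster_permutation_p(cond_a, cond_b, t_threshold, n_perm=1000, seed=0):
    """Return the observed maximum cluster mass and its permutation p value."""
    diff = np.asarray(cond_a, dtype=float) - np.asarray(cond_b, dtype=float)
    observed = max_cluster_mass(paired_t(diff), t_threshold)
    rng = np.random.default_rng(seed)
    null = np.empty(n_perm)
    for i in range(n_perm):
        # Randomly swap condition labels within each subject (sign flip).
        flips = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        null[i] = max_cluster_mass(paired_t(diff * flips), t_threshold)
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value


rng = np.random.default_rng(1)
reward = rng.standard_normal((9, 100))
neutral = rng.standard_normal((9, 100))
reward[:, 40:60] += 3.0               # simulated effect in a 20-sample window
mass, p = cluster_permutation_p(reward, neutral, t_threshold=2.306, n_perm=500)
print(p < 0.05)
```

Because only the largest cluster of each permutation enters the null distribution, the test controls the family-wise error rate across all time points at once; taking the maximum across channels as well extends the same control to multiple channels.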
Functional connectivity
Two distinct types of functional connectivity between regions were assessed: phase-locking statistics (PLV) and amplitude correlation, which capture different mechanisms of interaction (Siems and Siegel, 2020). Two oscillations can align their phases without changing in amplitude or may show amplitude comodulations while their phase-relations remain random. Phase-locking is better suited for understanding interactions between regions that occur in close proximity, such as between amygdala and hippocampus, or at low frequencies which are generated by larger populations of neurons across a broader area. In contrast, amplitude correlation is better suited for longer range connections, such as between amygdala and OFC, or where activity is more discretely localized, as in the case of HFG generators (Buzsáki, et al., 2012; Lachaux et al., 2012; Siegel et al., 2012; Siems and Siegel, 2020). Granger prediction was used to test for directionality. However, as no significant effects were found, we do not discuss this analysis further. Functional connectivity was visualized using Surf Ice (https://www.nitrc.org/projects/surfice/). There were eight patients with both amygdala and OFC electrodes and seven patients with both amygdala and hippocampus electrodes. Connectivity between the hippocampus and OFC was not assessed because of limited numbers of patients with both electrodes (n = 4). We examined connectivity between regions within each hemisphere. Our preprocessing pipeline ensured the data were conditioned in a way that was amenable to accurate connectivity analysis. The bipolar re-referencing scheme ensured that the different regions did not share a common reference and volume conduction, both of which can cause spurious correlations between channels. The number of trials in each condition was also equalized as connectivity analysis is sensitive to sample-size bias. 
We also report χ2 statistics comparing the number of electrode pairs showing increased and decreased connectivity and plot the correlation in amplitude between regions across trials for significant time/frequency windows.
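The χ2 comparison of increases versus decreases can be reproduced with a few lines of standard-library Python. The sketch assumes a one-degree-of-freedom goodness-of-fit test against an equal split, without continuity correction; the text does not state this explicitly, but the assumption reproduces the reported values (for example, 18 electrode pairs increasing and 2 decreasing gives χ2(1) = 12.8). The function name is hypothetical.

```python
import math


def chi_square_equal_split(n_increase, n_decrease):
    """Goodness-of-fit chi-square (1 df) against a 50/50 split.

    Returns the chi-square statistic and its p value.
    """
    expected = (n_increase + n_decrease) / 2
    chi2 = ((n_increase - expected) ** 2 + (n_decrease - expected) ** 2) / expected
    # For 1 df, the chi-square survival function equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = chi_square_equal_split(18, 2)
print(round(chi2, 2), p < 0.001)
```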
Phase-locking values
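Phase-locking analysis (PLV/PLS) was used to test for synchronization between regions. PLV is a measure of the consistency of the phase θ differences between two channels (x and y) at a particular time t and frequency f across trials, regardless of absolute phase and amplitude (Lachaux et al., 1999):
$$\mathrm{PLV}_{x,y}(t,f) = \frac{1}{N}\left|\sum_{n=1}^{N} e^{\,i\left[\theta_x(t,f,n)-\theta_y(t,f,n)\right]}\right|$$
where N is the number of trials and n indexes trials.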
The resulting coefficient is bound between 0 and 1 indicating the strength of PLV. PLV was calculated using the same time-frequency decomposition parameters as used for the low-frequency analysis and statistically evaluated with SPM. Before analysis, contrasts of interest were subtracted and the resulting subtraction image was Fisher Z-transformed to approximate a normal distribution for parametric analysis with a one-sample t test. We opted for phase-locking over coherence as a measure of functional connectivity as coherence conflates phase with amplitude correlation and PLV is sufficient on its own to demonstrate functional connectivity. We also looked at phase-locking between signals in the HFG range. PLV was estimated in 250-ms-wide windows centered every 10 ms between 60 and 250 Hz in 4-Hz intervals with 10-Hz frequency smoothing applied using DPSS tapers. Subsequently, the data were averaged over the frequency dimension and smoothed with a 200-ms sliding average (all effects were significant without smoothing also). HFG range PLV was statistically analyzed using the same time-series permutation methods as HFG amplitude. We only report effects as significant if they surpassed a Bonferroni–Holm corrected threshold determined by the number of connections (amygdala-OFC, amygdala-hippocampus) and contrasts tested which was four for the cue phase and eight for the outcome phase.
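A minimal NumPy sketch of the PLV computation follows. It takes phase estimates that have already been extracted (for example, from the time-frequency decomposition) and computes the modulus of the trial-averaged unit phasor of the phase difference, following the definition of Lachaux et al. (1999). The function name and array layout are assumptions.

```python
import numpy as np


def phase_locking_value(phase_x, phase_y):
    """PLV across trials at each time point.

    phase_x, phase_y: arrays of shape (n_trials, n_times) holding the phase
    (radians) of channels x and y at one frequency.
    Returns an array of shape (n_times,) with values between 0 and 1.
    """
    diff = np.asarray(phase_x) - np.asarray(phase_y)
    return np.abs(np.exp(1j * diff).mean(axis=0))


rng = np.random.default_rng(0)
phase_x = rng.uniform(-np.pi, np.pi, size=(200, 50))
locked = phase_locking_value(phase_x, phase_x - 0.7)    # constant lag on every trial
unlocked = phase_locking_value(
    phase_x, rng.uniform(-np.pi, np.pi, size=(200, 50)))  # unrelated phases
print(locked.min(), unlocked.max())
```

A constant phase lag yields a PLV of 1 regardless of the size of the lag or the signal amplitudes, whereas unrelated phases yield values close to 0 that shrink further as the number of trials grows, which is why trial counts were equalized across conditions.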
Amplitude-envelope correlation (AEC)
To examine nonphase-locked connectivity in the HFG range, we computed the AEC from the HFG band envelope described above. We computed the Spearman correlation coefficient between each electrode pair (across time) in 500-ms time windows centered every millisecond. The Fisher Z-transformed HFG AEC was statistically analyzed using the same time-series cluster-based nonparametric permutation method as HFG amplitude and PLV. Spearman correlation was also performed on the square root transformed power estimates in the low frequency decomposition (across trials) and analyzed with SPM after Fisher Z-transformation. Correction for multiple comparisons was the same as for low-frequency power and PLV.
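The sliding-window AEC can be sketched as below for a single trial and electrode pair. This is an illustration under stated assumptions rather than the authors' code: Spearman correlation is computed as the Pearson correlation of ordinal ranks (ties are broken by order, which is adequate for continuous envelope data), correlations are clipped slightly inside ±1 so the Fisher Z-transform stays finite, and the function names and the `step` parameter are hypothetical.

```python
import numpy as np


def ordinal_ranks(values):
    """Ranks 0..n-1; ties broken by order of appearance."""
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(len(values))
    return ranks


def spearman(a, b):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(ordinal_ranks(a), ordinal_ranks(b))[0, 1]


def sliding_aec(env_x, env_y, fs=1000.0, window_ms=500, step=1):
    """Fisher Z-transformed Spearman correlation in sliding windows.

    env_x, env_y: 1-D amplitude envelopes of equal length.
    Returns window centres (sample indices) and the Z-transformed correlations.
    """
    half = int(round(window_ms * fs / 1000.0)) // 2
    centres = np.arange(half, len(env_x) - half, step)
    rho = np.array([
        spearman(env_x[c - half:c + half], env_y[c - half:c + half])
        for c in centres
    ])
    rho = np.clip(rho, -0.999999, 0.999999)   # keep arctanh finite
    return centres, np.arctanh(rho)           # Fisher Z-transform


rng = np.random.default_rng(0)
env_a = rng.random(2000)
centres, z_coupled = sliding_aec(env_a, env_a ** 2, step=100)         # monotonic relation
_, z_independent = sliding_aec(env_a, rng.random(2000), step=100)     # unrelated envelopes
print(z_coupled.min() > 5, np.abs(z_independent).max() < 0.3)
```

Using rank correlation means that any monotonic relation between the two envelopes counts as coupling, not only a linear one, which makes the measure less sensitive to the skewed distribution of amplitude values.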
Results
Behavior
Using a p-value threshold of 0.0167, derived by Bonferroni correcting for the three tests performed, reaction times were significantly faster on both reward (M = −0.1029, SEM = 0.048; t(783) = 4.5, p < 0.0001, CIs [0.16 0.42], one-tailed) and loss (M = −0.1065, SEM = 0.04; t(763) = 4.3, p < 0.0001, CIs [0.2 0.4], one-tailed) trials relative to neutral trials (M = 0.1942, SEM = 0.059). There was no significant difference in reaction times between reward and loss trials (t(859) = 0.09, p = 0.9, CIs [−0.06 0.07]; Fig. 1B). Patients were accurate on approximately half of trials (reward M = 58.8%, SD = 4.7%; loss M = 55.6%, SD = 4.4%; neutral M = 52.8%, SD = 8%) and the number of incorrect arrow direction classifications was very low (∼2%; 11 patients had at least one condition with no such trials), showing that they performed the task well (Table 1). Therefore, the vast majority of incorrect trials were nonresponses rather than erroneous button presses. In concordance with the reaction times, there were significantly more correct trials than incorrect trials for reward (χ2(1) = 24.8, p < 0.0001) and loss (χ2(1) = 12, p = 0.001) but not neutral (χ2(1) = 2.5, p = 0.114). A direct comparison of the ratio of correct to incorrect trials between conditions showed a significant difference between reward and neutral (χ2(1) = 5.9, p = 0.015) but not loss and neutral (χ2(1) = 1.8, p = 0.178) or reward and loss (χ2(1) = 1.16, p = 0.28). The faster reaction times on reward and loss trials compared with neutral indicate that the incentives influenced motivational responding.
Neurophysiology
For the analysis of LFPs in the anticipation phase of the task, we compared all trials on which patients responded regardless of correctness. This ensured that all trials were those that the patient was most strongly motivated to respond. For the analysis of reward and loss receipt, we compared reward outcomes (reward correct) with neutral correct and loss outcomes (loss incorrect) with neutral incorrect which allowed us to control for the correctness of responses. In the interval between response and outcome onset, it is possible patients may be able to know what type of outcome they will receive depending on the cue type and correctness of response. To rule this out, we also performed response-locked analyses. The vast majority of the effects we report (all but one) were not time-locked to the response. Low frequencies and HFG were analyzed separately and corrected for multiple comparisons (Fig. 2B).
We first analyzed time-frequency resolved activity and connectivity patterns during the anticipation phase. There were no significant differences between conditions in low-frequency activity in any region. However, there was a significant increase in HFG activity in the hippocampus in response to reward relative to neutral (t(9) > 2.3, p = 0.0017; Fig. 3A). All patients showed increased activity on reward trials relative to neutral trials within this significant time period (Fig. 3D). No significant differences between conditions emerged in the amygdala or OFC (Fig. 3B,C). Opposing patterns of findings would be expected from regions involved in value-prediction and motivation. A region involved in value prediction would not respond to the cue as it does not accurately predict outcome value because cues are only associated with reward and loss on approximately half of trials because of the RT interval staircasing. On the other hand, an area involved in motivation would increase anticipatory activity when the opportunity to obtain reward is more sporadic to drive the organism to expend more effort to obtain rewards that are rarer or more difficult to obtain. The hippocampus fit this pattern as it showed increased HFG activity during reward anticipation unlike the amygdala and OFC which appear to be more important for value coding. This is highly consistent with a previous study showing larger ERP deflections in the hippocampus to rewards that were more uncertain (Vanni-Mercier et al., 2009).
Phase-locking in the theta band between the amygdala and OFC was also increased for the reward condition relative to neutral in the anticipation phase (t(1,14) > 1.76, p < 0.001; Fig. 4A,B). Eighteen electrode pairs showed an increase in PLV on reward trials relative to neutral, and two pairs showed a decrease (χ2(1) = 12.8, p < 0.001; Fig. 4C). There was no significant correlation between the difference in PLV strength on reward compared with neutral trials and laterality (r = 0.008, p = 0.97) or anterior-posterior (r = 0.05, p = 0.9) location of OFC electrodes. These findings demonstrate an important role for hippocampal HFG activity and amygdala-OFC theta synchronization in reward anticipation. Anticipation by definition involves retrieval or maintenance of memories of rewards received on previous trials. The theta rhythm is known to be involved in working memory (Lisman and Jensen, 2013). Hippocampal HFG activity and theta band synchronization is consistent with their role in mnemonic processes.
We then turned our attention to time-frequency specific activity and connectivity patterns in the receipt phase. Similar to the anticipation phase, in the hippocampus, there was increased HFG activity to reward compared with neutral (t(15) > 2.3, p = 0.011; Fig. 5E). Eight out of nine patients showed this pattern (χ2(1) = 5.4, p = 0.02; Fig. 5F). Increased HFG on reward trials was accompanied by alpha band suppression (t(1,48) > 1.68, p < 0.001; Fig. 6A,B,D) in the same number of patients which is known to have an antagonistic relationship with HFG (Staresina et al., 2016), as the inhibitory phase of alpha is believed to suppress gamma, reductions in alpha increase gamma. There was also an increase in delta for reward (t(1,48) > 1.68, p < 0.001) compared with neutral which was consistent in eight out of nine patients (χ2(1) = 5.4, p = 0.02; Fig. 6A,C,E). However, this was not specific to reward as there was also an increase in delta for loss (t(1,48) > 1.68, p < 0.001) which was consistent in seven (χ2(1) = 2.8, p = 0.096) out of nine patients. The selectivity for reward in hippocampal HFG activity in both the anticipation and the outcome phase of the task and the known role of the hippocampus in memory processes may suggest that a representation of the reward outcome is being stored in the outcome phase and reinstated during the anticipation phase. This is highly consistent with recent findings of reward dedicated cells in the hippocampus (Gauthier and Tank, 2018). An important outstanding question is where this reward signal may originate from.
In the amygdala, there was also a significant increase in HFG to reward compared with neutral (t(15) > 2.1, p = 0.003) as well as loss compared with neutral (t(15) > 2.1, p = 0.0018; Fig. 5A) which was consistent in 13 (χ2(1) = 6.25, p = 0.012) and 14 (χ2(1) = 9, p = 0.003) patients out of 16, respectively (Fig. 5A,B). As HFG is believed to reflect population level spiking activity, the patterns of HFG responses seen in the amygdala are consistent with primate studies showing single amygdala neurons encoding positive and negative value (Paton et al., 2006; Belova et al., 2008; Bermudez and Schultz, 2010; Jezzini and Padoa-Schioppa, 2020). As this single-unit activity has been shown to be greater to reward and punishment in more medial nuclei (Zhang et al., 2013), we tested the hypothesis that these HFG responses were lateralized in the same manner in humans. The observed amygdala activity patterns were subtracted between conditions, averaged within significant time windows and correlated with MNI coordinates. A significance threshold of p = 0.0125 was set by Bonferroni correcting for the number of contrasts (reward/loss) and hemispheres (right/left). The difference in HFG between loss and neutral in the right amygdala was significantly correlated with contact laterality, with increased responses in more medial contacts (r = 0.6, p = 0.005, one-tailed; Fig. 5G–I). No correlation was seen in the left amygdala (r = 0.2, p = 0.4) or when the left and right amygdala were combined (r = 0.11, p = 0.5). However, there was a more restricted distribution of electrode positions sampled in the left amygdala (MNI X coordinate SD = 2.6 mm) compared with the right amygdala (SD = 3.4 mm). This may be why no significant correlations were found on the left. In the amygdala, there was also a significant increase in delta/theta activity, around 4 Hz, for loss compared with neutral (t(1,90) > 1.66, p < 0.001; Fig. 6F–H) which was consistent in 12 out of 16 patients (χ2(1) = 4, p = 0.046). The selectivity for loss in this specific frequency response may relate it to the known role of the amygdala in loss aversion (De Martino et al., 2010).
In addition to both amygdala and hippocampus HFG being modulated by reward independently, the two signals showed significant synchronization as there was increased HFG phase-locking between amygdala and hippocampus on reward compared with neutral trials (t(6) = 2.5, p < 0.0001), with 20 electrode pairs showing an increase in phase-locking and seven a decrease (χ2(1) = 6.26, p = 0.012; Fig. 7A). This HFG cross talk between amygdala and hippocampus may be the way in which reward value prioritizes memory storage for reinstatement during the anticipation phase. There was also significantly increased delta phase-locking between amygdala and hippocampus during receipt of rewards compared with neutral (t(1,12) > 1.78, p < 0.001). Twenty-three electrode pairs showed increased phase-locking whereas four pairs showed a decrease (χ2(1) = 13.37, p < 0.001; Fig. 7D–F). A response-locked analysis showed that delta PLV was also increased after the response on reward trials compared with neutral (t(6) = 1.94, p < 0.001). Twenty-two electrode pairs showed increased phase-locking whereas five decreased (χ2(1) = 10.7, p = 0.001; Fig. 7G–I). This suggests that this signal may also be involved in inferring reward receipt based on knowledge of response correctness and cue type. This finding is consistent with the known role of delta in motivational function (Wu et al., 2018) and the tight link between motivation and action. However, no other effects in the outcome phase were significantly locked to the response.
There was also a suppression of correlation between amygdala and hippocampus theta amplitude on loss relative to neutral trials (t(1,12) > 1.78, p < 0.001; Fig. 8A,B). Twenty-five electrode pairs showed decreased correlation on loss relative to neutral trials whereas two electrode pairs showed an increase (χ2(1) = 19.6, p < 0.001; Fig. 8C). After averaging theta amplitude within this significant time-window, we found the trial-by-trial Spearman correlation between amygdala and hippocampus was significantly larger on neutral trials relative to loss trials using all trials across all patients (r difference = –0.32, z = −4.5, p < 0.0001; Fig. 8D). All patients showed more positive correlations on neutral trials relative to loss trials. Overall, these findings demonstrate increased coupling between amygdala and hippocampus for reward and decreased coupling for loss. The findings suggest that the amygdala may be one of the origins of the reward signals observed in hippocampus.
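The text reports a z statistic for the difference between two correlation coefficients without naming the test. One common way to obtain such a statistic is Fisher's r-to-z transformation for two independent correlations, sketched below. The choice of this test, the function name, and the input values (which are hypothetical, not the study's data) are all assumptions made for illustration.

```python
import math


def fisher_z_difference(r1, n1, r2, n2):
    """z statistic for the difference between two independent correlations."""
    # Fisher transformation makes the sampling distribution roughly normal.
    z1, z2 = math.atanh(r1), math.atanh(r2)
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (z1 - z2) / standard_error
    # Two-tailed p value from the standard normal distribution.
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2))
    return z, p_two_tailed


# Hypothetical example: weaker coupling on loss trials than on neutral trials.
z_stat, p_val = fisher_z_difference(r1=0.10, n1=300, r2=0.42, n2=300)
```

A negative z here indicates that the first correlation (loss, in this toy example) is smaller than the second (neutral).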
Like the amygdala, the OFC also showed increases in HFG activity for reward and loss. However, there were no significant differences between reward and neutral or loss and neutral. Although this contrast was of primary interest, we also compared reward outcomes (reward correct) and loss outcomes (loss incorrect) with neutral outcomes from the reward incorrect condition and loss correct condition based on the hypothesis that the OFC could be involved in comparing outcome values in the context of the cue across trials (Rudebeck and Murray, 2008; Kennerley et al., 2011; Riceberg and Shapiro, 2012; Saez et al., 2017, 2018). Indeed, there was a significant increase in HFG when patients received a reward for being correct compared with when they did not receive a reward for being incorrect (t(8) > 2.3, p = 0.018). Likewise, there was increased HFG when patients received a loss for being incorrect compared with when they did not receive a loss for being correct (t(8) > 2.3, p = 0.015; Fig. 5C,D). In both cases, seven out of nine patients showed consistent differences (χ2(1) = 2.8, p = 0.096). This was not driven by the correctness of the response as the direction of differences between correct and incorrect was opposite for reward and loss and a control analysis comparing correct and incorrect neutral trials showed no significant differences. There were no significant correlations between HFG responses and laterality or anterior-posterior (all ps >0.7) location of OFC electrodes. In the OFC, there was also significant late theta band suppression for both reward and loss compared with neutral (t(1,48) > 1.68, p < 0.001; Fig. 6I–K) which was consistent in nine and seven (χ2(1) = 2.8, p = 0.096) out of nine patients, respectively. 
Theta activity in OFC and connectivity with other regions (such as hippocampus) is known to peak in response to major events in reward tasks (Knudsen and Wallis, 2020), and the difference between conditions appears to be the waning of this transient increase to reward and loss.
We then asked whether the HFG modulations seen in both amygdala and OFC during the outcome phase were functionally related. There was no significant phase-locking between amygdala and OFC in this phase. However, there were two significant differences in HFG AEC, which is a connectivity metric better suited for detecting longer range connections (Fig. 9). The first effect was a significant increase on loss incorrect trials relative to neutral incorrect trials (t(7) > 2.37, p = 0.004, gray horizontal bar; Fig. 9A,B,D). Nineteen electrode pairs showed an increase in correlation and 1 pair showed a decrease (χ2(1) = 16.2, p < 0.001). The second effect was a significant increase on loss incorrect trials relative to loss correct trials (t(7) > 2.37, p = 0.006, black horizontal bar; Fig. 9A–C). Fifteen electrode pairs showed an increase in correlation and five electrode pairs showed a decrease (χ2(1) = 5, p = 0.025). A more fine-grained trial-by-trial analysis averaging HFG activity on each trial within the significant time windows and then correlating both regions across trials showed that there was significantly increased correlation on loss incorrect trials compared with loss correct trials (r difference = 0.23, z = 2.09, p = 0.018, one-tailed) and neutral incorrect trials (r difference = 0.21, z = 1.9, p = 0.03, one-tailed; Fig. 9B). On the grounds of the loss-specific amygdala-OFC coupling and its potential role in adjusting behavior based on negative feedback (Chau et al., 2015; Rudebeck et al., 2017a), we re-analyzed the reaction times to see whether there were corresponding changes in behavior on subsequent trials using linear mixed effects models with fixed effects factors of condition (reward, loss, neutral) and arrow direction (left/right) and random effects factors of subject. 
For this analysis, the critical p-value threshold of 0.025 was derived by Bonferroni correcting for the number of contrasts (which was two: reward correct vs neutral correct and loss incorrect vs neutral incorrect). This analysis showed that reaction times were slower on trials immediately after loss incorrect trials (M = 0.1, SEM = 0.05) compared with after neutral incorrect trials (M = –0.1, SEM = 0.07; t(397) = 2, p = 0.023, one-tailed, CIs [0.003 0.35]). In contrast, there was no significant difference in reaction times after reward correct (M = 0.11, SEM = 0.1) compared with after neutral correct (M = 0.09, SEM = 0.07; t(393) = 0.4, p = 0.73, one-tailed, CIs [–0.16 0.23]; Fig. 9E; Table 1). This suggests that patients adjusted their behavior after losses by responding more cautiously. In summary, the amygdala-OFC connectivity findings are a significant advance from previous studies which were limited by methodological issues including small sample size, a common reference, volume conduction and lack of an appropriate baseline (Jenison, 2014).
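The trial-by-trial coupling analysis described above (averaging HFG activity within a time window on each trial, then correlating the two regions across trials) can be sketched as follows. This is a minimal illustration under stated assumptions: the amplitude envelopes are taken as already computed, a Pearson correlation is used for brevity, and all names and toy values are hypothetical rather than taken from the study.

```python
import math


def window_mean(trial_amplitude, start, stop):
    """Mean amplitude of one trial over samples [start, stop)."""
    window = trial_amplitude[start:stop]
    return sum(window) / len(window)


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def trialwise_coupling(region_a_trials, region_b_trials, start, stop):
    """Correlate window-averaged amplitude between two regions across trials."""
    a_means = [window_mean(t, start, stop) for t in region_a_trials]
    b_means = [window_mean(t, start, stop) for t in region_b_trials]
    return pearson_r(a_means, b_means)


# Toy envelopes (4 trials x 4 samples); the window covers samples 2 and 3.
amygdala_trials = [[1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 5.0, 5.0],
                   [0.0, 1.0, 1.0, 2.0], [3.0, 3.0, 6.0, 8.0]]
# Constructed so that the window mean is exactly 2 * amygdala mean + 1.
ofc_trials = [[0.0, 0.0, 7.0, 9.0], [0.0, 0.0, 10.0, 12.0],
              [0.0, 0.0, 3.0, 5.0], [0.0, 0.0, 14.0, 16.0]]

r_coupling = trialwise_coupling(amygdala_trials, ofc_trials, start=2, stop=4)
```

Coupling values computed separately for two conditions could then be compared with a test for the difference between correlations.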
Like HFG activity, we did not find any significant correlation between the difference in connectivity strength on loss incorrect and neutral incorrect trials and laterality (r = –0.1, p = 0.6) or anterior-posterior (r = –0.3, p = 0.2) location of OFC electrodes. There was also no significant correlation between the difference in connectivity strength on loss incorrect and loss correct trials and laterality (r = 0.3, p = 0.3) or anterior-posterior (r = 0.2, p = 0.5) location of OFC electrodes. In contrast, previous fiber tracing studies in macaques have shown that the majority of axons from the amygdala to the OFC terminate in its posterior part (Carmichael and Price, 1995, 1996). There are several reasons for this discrepancy. First, there may be species differences. Second, there are differences between functional and anatomic connectivity. It is possible that the functional connectivity findings are mediated by poly-synaptic connections. Unfortunately, as we only sampled OFC activity from a small number of electrodes, we cannot determine this and our findings regarding anatomic specificity are not definitive. In this regard, the ability to acquire activity from the whole OFC and beyond with fMRI may have some advantages for determining regional connectivity patterns (Kahnt et al., 2012). However, our findings are consistent with both primate and human ECoG studies of reward processing in the HFG range which had much better coverage of the OFC. In humans, HFG reward signals were found isotropically throughout the OFC (including areas 10, 11, 13, 14, and 12/47; Saez et al., 2018) and in primates, HFG reward signals were actually stronger in the anterior OFC and single neuron activity showed no strong regional differences (Rich and Wallis, 2017).
We also did not detect any directionality with regard to our connectivity findings. This could be because the interactions were bidirectional. However, while we were able to detect important activity and functional connectivity patterns, we cannot rule out that effective connectivity may be detected with a larger number of trials, as required by Granger causality analysis. Testing time constraints are one of the limitations of working with patients. It could also be the case that the amygdala and OFC are exchanging information back and forth at a rate fast enough that it cancels out in the large time-windows necessary to calculate Granger causality, particularly in the high-frequency range. In this regard, paradigms with longer-duration manipulations may be more compatible with this analysis. A previous study in primates benefitting from thousands of trials and electrode pairs found that amygdala-OFC connectivity between 5 and 100 Hz was increased to conditioned stimuli during learning whereas this was reduced postlearning (Morrison et al., 2011). As our task was nondeterministic, it is difficult to parse when learning is occurring or not and so the direction of influence may be less clear.
Discussion
We used the unique opportunity to record LFPs directly from human reward circuits in epilepsy patients to identify oscillatory and connectivity dynamics involved in reward and loss processing that until recently were only tractable in animals. We used a well-established measure of human reward processing from imaging (Knutson and Greer, 2008) to show that the amygdala, OFC and hippocampus participate individually and collectively in different aspects of reward and loss processing. The incentive value of the rewards and losses had a powerful impact on behavior, eliciting faster responses than neutral. Corresponding changes were seen in the brain. The key findings were that amygdala and OFC HFG was involved in processing both reward and loss value whereas the hippocampus was more selectively involved in reward. Connectivity patterns between regions were further dissociated by valence. Amygdala-OFC theta and amygdala-hippocampus delta/HFG phase-locking encoded reward anticipation and receipt, respectively, whereas amygdala-OFC HFG and amygdala-hippocampus theta correlation encoded loss receipt (Fig. 10, summary).
The patterns of amygdala and OFC HFG responses to reward and loss receipt were highly consistent with previous primate studies showing their role in conditioned and unconditioned reward and punishment (Paton et al., 2006; Padoa-Schioppa and Assad, 2006; Belova et al., 2008; Morrison and Salzman, 2009; Bermudez and Schultz, 2010; Morrison et al., 2011; Jezzini and Padoa-Schioppa, 2020). Both amygdala and OFC have “positive” and “negative” value coding cells intermixed which is consistent with our finding of HFG responses to both rewards and losses. The HFG response to loss was also larger on more medial contacts in the right amygdala. This is consistent with a primate study which found value coding cells were more medial (Zhang et al., 2013) and compatible with current understanding of the role of different amygdala subnuclei in aversive conditioning. Processing of conditioned stimuli proceeds from the lateral to medial nuclei which represent the sensory input and response output regions, respectively (Janak and Tye, 2015).
Compared with the amygdala, the OFC showed a more complex pattern of responses. Like the amygdala, the OFC showed HFG increases to reward correct and loss incorrect receipt but significant differences only emerged when contrasted with reward incorrect and loss correct. In other words, differences between reward, loss and neutral only emerged when a reward or loss outcome was cued and received compared with cued and not received. This is consistent with the response patterns of OFC neurons. The OFC has been proposed to encode the identity of specific outcomes and their current value allowing stimuli to be valued in a common currency, compared, and chosen between (Rudebeck and Murray, 2014). Previous studies have shown that OFC carries information about rewards received or lost on previous trials into future trials to guide actions (Rudebeck and Murray, 2008; Kennerley et al., 2011; Riceberg and Shapiro, 2012; Saez et al., 2017, 2018). Similarly, we propose that the OFC is comparing the value of outcomes within the reward and loss conditions across trials instead of representing the immediate value regardless of its history or cue context. Such reference dependent coding may be partly driven by a suppression of reward and loss coding cells when omitted.
Crucially, we demonstrate, for the first time in humans, loss-specific HFG functional connectivity between the amygdala and OFC. A causal role for amygdala-OFC interactions in supporting primate reward behaviors has been established by severing the reciprocal anatomic connections between the two, which impairs the ability to update the value of conditioned stimuli that have been devalued (Baxter et al., 2000; Fiuzat et al., 2017), and by excitotoxic, fiber-sparing lesions of the amygdala, which suppress OFC activity during reward learning and receipt (Rudebeck et al., 2013). Selective connectivity between amygdala and OFC for loss links what we know about damage to each region individually. Amygdala damage impairs loss aversion (De Martino et al., 2010) and OFC damage results in failure to use negative feedback to guide behavior (Wheeler and Fellows, 2008). Amygdala-OFC connectivity may reflect negative value in the amygdala being coordinated with the OFC where it can guide future behavior. In support, patients were more cautious in their responses after receiving a loss. This is consistent with primate studies where lose-shift behavior increased OFC-amygdala connectivity (Chau et al., 2015) and amygdala lesions impaired learning from negative feedback and reduced the number of OFC neurons encoding reward associations (Rudebeck et al., 2017a). The connectivity pattern was nonphase-locked, which may be because of the larger distance between assemblies or the functions in each region needing to be distinct but temporally coordinated.
We also found increased amygdala-OFC theta synchronization in the latter part of the anticipation phase. This bears a striking resemblance to a recent study which showed theta phase-locking between OFC and hippocampus when anticipating reward (Knudsen and Wallis, 2020). Furthermore, in that study, closed-loop microstimulation phase-locked to the theta peak disrupted both connectivity and decision-making performance. The hippocampus was proposed to provide a cognitive map of task space for use by OFC where it is combined with value. It is highly plausible that the amygdala is also involved in this process as theta oscillations are generated by larger populations of neurons and have been proposed to coordinate large scale networks, particularly between prefrontal and medial temporal lobes (Helfrich and Knight, 2016). In both studies, the tasks were nondeterministic, which may lead to representations of value that are maintained in working memory rather than stored as long-term stimulus-reward associations. This would be consistent with the known role of theta in working memory (Lisman and Jensen, 2013). While stimuli, task, value, etc. may be maintained by theta rhythms in each region, brief periods of synchrony may bind these attributes together to guide an impending response.
Hippocampal HFG activity was specifically involved in reward. This is highly consistent with recent evidence of dedicated reward coding cells in the hippocampus (Gauthier and Tank, 2018). Anticipatory hippocampal HFG reward activity may arise through connectivity with regions such as the nucleus accumbens (Nacc) which is involved in appetitive motivation during the cue phase of the MID task (Wu et al., 2018) and interconnected with the hippocampus to mediate contextualized reward behavior like conditioned place preference (Ito et al., 2008; LeGates et al., 2018). Reward selectivity in hippocampus HFG activity was also found in the outcome and was accompanied by alpha desynchronization. Suppression of alpha has been suggested to index regional activation via the inhibitory phase of alpha suppressing HFG (Jensen and Mazaheri, 2010). This HFG activity showed phase-locking with the amygdala, a mechanism which may gate the transfer of reward information into memory through attentional selection and binding (Fries, 2005) thereby allowing it to be reinstated during the cue in concert with motivational processes. Reward related LFPs recorded from Nacc in mice and humans in the MID task have been shown in the delta range, which we found in terms of hippocampus amplitude and phase synchronization with amygdala (Wu et al., 2018). Further evidence of reward selectivity was shown by the suppression of amygdala and hippocampus theta amplitude coupling for loss.
Oscillations that communicate reward information could act as biomarkers for psychiatric symptoms. For example, the amygdala-hippocampus HFG circuit might be underactive and the amygdala-OFC HFG circuit overactive in depression whereas the opposite pattern may contribute to addiction. These connections could be manipulated with stimulation to examine their causal role and potential therapeutic benefit. This is important because reward and punishment coding cells in the amygdala and OFC are intermixed, making it difficult to exclusively modulate one population (Zhang et al., 2013; Rich and Wallis, 2014). As connectivity patterns are valence specific, upregulating and downregulating connectivity may be more effective at exclusively modulating reward or punishment processing.
Despite the many advantages of intracranial recordings, it is necessary to bear in mind the limitations. There may be differences between the epileptic and normal brain. However, we removed any electrodes that were artefactual or implanted in areas that were subsequently resected, and patients showed the same behavioral results as normal participants. The number of regions recorded from is also limited and we have no control over the exact positions of the electrodes, which are decided by clinicians. This makes it difficult to know whether other regions mediate connectivity and complicates inferences about spatial specificity. There are also time constraints with patients, which means we cannot run a large number of trials. This makes some analyses, such as Granger causality, difficult.
In conclusion, by applying a well validated measure of reward processing to a unique cohort of epilepsy patients we have been able to uncover temporal, spectral and connectivity dynamics from key regions of the human reward circuit and dissociate some of their key functions. This has allowed us to confirm and extend findings from research that has largely been confined to nonhuman primates and rodents. We hope these findings will contribute to a mechanistic understanding of the human reward circuit and allow for predictions about how variations in function may underlie aspects of mental disorder and equally, alleviation via neuromodulation.
Footnotes
Acknowledgements: We thank all the patients for taking part. This work was supported by the Natural Science Foundation of China Grant 81771482 (to B.S.), the Shanghai Jiao Tong University Trans-med Awards Research Grant 2019015 (to B.S.), the Shanghai Clinical Research Center for Mental Health Grant 19MC191100 (to B.S.), and the Medical Research Council Senior Clinical Fellowship MR/P008747/1 (to V.V.).
The authors declare no competing financial interests.
Correspondence should be addressed to Shikun Zhan at shikun_zhan{at}163.com, Bomin Sun at bomin_sun{at}163.com, or Valerie Voon at vv247{at}cam.ac.uk.