Abstract
How do we evaluate whether someone will make a good friend or collaborative peer? A hallmark of human cognition is the ability to make adaptive decisions based on information garnered from limited prior experiences. Using an interactive social task measuring adaptive choice (deciding who to reengage or avoid) in male and female participants, we find the hippocampus supports value-based social choices following single-shot learning. These adaptive choices elicited a suppression signal in the hippocampus, revealing sensitivity for the subjective perception of a person and how well they treat you during choice. The extent to which the hippocampus was suppressed was associated with flexibly interacting with prior generous individuals and avoiding selfish individuals. Further, we found that hippocampal signals during decision-making were related to subsequent memory for a person and the offer they made before. Consistent with the hippocampus leveraging previously executed choices to solidify a reliable neural signature for future adaptive behavior, we also observed a later hippocampal enhancement. These findings highlight the hippocampus playing a multifaceted role in socially adaptive learning.
SIGNIFICANCE STATEMENT Adaptively navigating social interactions requires an integration of prior experiences with information gleaned from the current environment. While most research has focused on striatal-based feedback learning, open questions remain regarding the role of hippocampal-based episodic memory systems. Here, we show that during social decisions based on prior experience, hippocampal suppression signals were sensitive to adaptive choice, while hippocampal enhancements were related to subsequent memory for the original social interaction. These findings highlight the hippocampus playing a multifaceted role in socially adaptive learning.
Introduction
Humans expertly navigate through dynamic social worlds despite the sheer amount of information they are bombarded with. Although another's motivations are largely hidden from us, we can make socially adaptive decisions, such as who to cooperate with or trust (FeldmanHall and Shenhav, 2019). Such success requires an efficient integration of prior experiences with information gleaned from the current environment. Classic models of decision-making suggest that through repeated experience, humans incrementally fine-tune their behavior using prediction errors (Montague and Berns, 2002; King-Casas et al., 2008; Gläscher et al., 2009, 2010), which enables us to learn who to approach and who to avoid. However, we can also learn and make adaptive decisions from relatively limited experience. Indeed, a hallmark of human cognition is that complex concepts can be learned from a single experience (Lake et al., 2015).
A growing body of research shows that individuals routinely make judgements based on limited prior experience. Even briefly glancing at a person's face can provide enough information to judge whether that person can be trusted (Engell et al., 2007; Mende-Siedlecki et al., 2013; Todorov and Mende-Siedlecki, 2013). Thus, even when information is dynamic and multidimensional, and involves moral qualities, humans are highly adept at encoding relevant information from a single brief exposure. Less is known, however, about how people retrieve this information to adaptively decide whether to reengage or avoid a particular individual. Our group showed that intact detailed, episodic memories of the prior exchange may be a necessary requirement (Murty et al., 2016; Schaper et al., 2019). This suggests that making flexibly adaptive choices from limited experience necessitates the recollection of contextual details from the original social encounter.
Despite this behavioral evidence, the neural mechanisms that instantiate socially adaptive single-shot learning remain unknown. There are two competing theories (Ghiglieri et al., 2011; Woolley et al., 2013). On the one hand, value-based learning is canonically considered to be in the domain of the striatum, for both multitrial nonsocial learning (O'Doherty et al., 2003; Hare et al., 2008; Diederen et al., 2016; Bornstein and Norman, 2017) and social learning (Hackel et al., 2015). On the other hand, the hippocampus, a region known for its central role in long-term episodic memory (Davachi, 2006; Eichenbaum et al., 2007) may instead be recruited, which would mirror the functional role of this region in memory retrieval, spatial learning, and cognitive maps (Schapiro et al., 2013, 2016; Kaplan et al., 2017a,b; Nau et al., 2018; Omer et al., 2018). Indeed, prior research shows that the hippocampus prioritizes the encoding of valuable everyday items and the contexts in which they are encountered (Wittmann et al., 2005; Adcock et al., 2006; Murty and Adcock, 2014).
By focusing on the hippocampus and striatum, we can identify the role of these distinct learning systems during the instantiation of an adaptive social choice informed by a single prior social interaction. We hypothesized that the hippocampus would play an outsized role in supporting socially adaptive choices from just one learning episode. We collected fMRI data during a social decision-making task (Murty et al., 2016), in which participants first played an interactive game where a series of people offered either fair or unfair monetary splits in a Dictator Game (DG; Fig. 1A). After a delay, subjects indicated which of these people they would prefer to interact with in a subsequent Dictator Game. Finally, participants completed a surprise memory test to probe whether individuals' episodic memory for the initial exposure was intact. This design allowed us to test whether such adaptive decisions to reengage with fair individuals and avoid unfair individuals recruit a hippocampal-dependent learning system rather than a striatal-dependent learning system.
Materials and Methods
Subjects
We scanned 28 healthy, right-handed participants to yield a sample of at least 20 participants after removing participants for lack of behavioral variance. Sample size was determined by existing work using the same paradigm and behavioral analysis pipeline (Murty et al., 2016). Eight participants were excluded from analyses because of computer malfunctions during retrieval (N = 2); failure to show any variability in choice behavior (same choice selected throughout the task; N = 5); and failure to believe that they were playing with other real partners during the task (N = 1). This led to a final sample of 20 participants (median age = 23 years; age range = 18–34 years; 10 females). Participants provided written consent, and the experiment was approved by the New York University Committee on Activities Involving Human Subjects. All subjects were paid $25/h and could make up to an additional $10 based on their decisions during the task.
Stimuli set
The stimuli used in the DG and subsequent Decision Task were taken from pictures of white male faces approximately between the ages of 18 and 24 years (http://iilab.utep.edu/stimuli.htm). Each stimulus featured a unique, emotionally neutral face. To determine whether the stimuli were matched in attractiveness, dominance, and trustworthiness, an independent group (N = 30) rated each stimulus on Amazon Mechanical Turk. This task consisted of 179 faces, which were rated along the dimensions of “Attractiveness,” “Approachability,” and “Overall Positive or Negative Feeling.” From this task, we selected the 120 faces that were the most neutral on these three dimensions.
Tasks
As detailed in previous work (Murty et al., 2016), subjects completed four tasks (Fig. 1). While in the scanner, participants first played the recipient in a DG, receiving varied monetary splits of $10 from trial-unique Dictators. The Dictator could divide the $10 however he saw fit, and subjects were required to accept the split. Monetary splits ranged from highly unfair ($0.10–$1.50 of $10) to relatively fair ($3.6–$5 of $10). Following the offer, participants were then asked how they felt about the split (on a 3-point scale; 1 = good to 3 = bad). Subjects interacted with 60 unique color images of Dictators (30 fair offers, 30 unfair offers).
After the DG, subjects completed a distractor task, a 6 min task composed of easily solvable math problems. After this short delay, subjects completed the Decision Task in which they could select a partner for a subsequent DG. On each trial, a face and a schematic gray face were presented side by side (Fig. 1A). Subjects were tasked with deciding whether they would like to play with that person or a new person who would be chosen at random (indicated by selecting the schematic gray face). Every trial contained a trial-unique face such that either the face was previously seen during the first DG or it was an entirely novel face. Faces were selected randomly without replacement from the 60 faces presented during the first DG and 30 never before seen faces. Each trial was presented for 4 s, during which participants could make decisions any time while the face was visible. Once a decision was made, subjects did not play with the target player or receive additional feedback about that player's behavior. Each trial was followed by the presentation of a jittered fixation cross lasting between 2–6 s (average = 4 s). Trial order was pseudorandomized across participants such that no more than three trials of the same condition (fair, unfair, novel) would appear in a row.
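The ordering constraint described above (no more than three consecutive trials of the same condition) can be illustrated with a minimal sketch. It assumes a simple reshuffle-until-valid approach; the function names, the seed, and the rejection-sampling strategy are illustrative assumptions and are not taken from the original task code.

```python
import random


def longest_run(seq):
    """Return the length of the longest run of identical consecutive items."""
    best = run = 0
    prev = object()  # sentinel that never equals a real condition label
    for item in seq:
        run = run + 1 if item == prev else 1
        best = max(best, run)
        prev = item
    return best


def pseudorandom_order(conditions, max_run=3, seed=None, max_tries=10000):
    """Reshuffle until no condition appears more than max_run times in a row.

    Hypothetical helper: rejection sampling is an assumption, chosen for clarity.
    """
    rng = random.Random(seed)
    order = list(conditions)
    for _ in range(max_tries):
        rng.shuffle(order)
        if longest_run(order) <= max_run:
            return order
    raise RuntimeError("no valid order found within max_tries")


# 60 previously seen faces (30 fair, 30 unfair) plus 30 novel faces
trials = ["fair"] * 30 + ["unfair"] * 30 + ["novel"] * 30
order = pseudorandom_order(trials, max_run=3, seed=0)
```

Rejection sampling is adequate here because, with three equally frequent conditions, a valid order is typically found within a handful of shuffles.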
Outside of the scanner, subjects were given a surprise memory test in which we measured item memory (whether subjects recognized each face) and associative memory (memory for both the face and how much money the Dictator offered). We only tested memory for faces appearing in the Dictator Phase, not novel faces from the decision phase. Each trial consisted of either a face presented during the initial DG or an entirely new face, alongside a Likert scale of how confident they were that they had seen the face before during encoding (face memory: 1 = high confidence old, 2 = low confidence old, 3 = not sure, 4 = low confidence new, or 5 = high confidence new). To probe episodic memory for the offers previously made by each player, if subjects responded with a 1–3 for item memory, they had to indicate the monetary split associated with that person using a 5-point Likert scale ($0–$5, with $1 increments). After the experiment, subjects were funnel debriefed in a manner that effectively probes true believability of the task. Subjects answered on a 6-point Likert scale whether they had any doubt as to the veracity of the paradigm (1 = completely believed, 6 = did not believe). This allowed us to exclude subjects (N = 1) who indicated any disbelief that they were playing with real players.
fMRI acquisition and preprocessing
Functional imaging was performed using a Siemens Allegra 3 T head-only scanner located at the Center for Brain Imaging at New York University. Functional data were collected using an echoplanar pulse sequence (36 interleaved slices; TR = 2000 ms; TE = 30 ms; flip angle = 78°; FOV = 192 mm, voxel size = 3 mm isotropic). Slices were positioned ventrally to provide full coverage of the anterior temporal lobes and prefrontal cortex; this resulted in omission of the most dorsal parts of the superior parietal cortex. A high resolution T1-weighted anatomic scan [magnetization-prepared rapid acquisition gradient echo (MPRAGE) sequence, 1 mm isotropic] was also obtained for each subject after the decision task.
Functional MRI data were preprocessed using a pipeline designed to minimize the effects of head motion (Hallquist et al., 2013). This included simultaneous 4D slice-timing and head motion correction, skull stripping, intensity thresholding, coregistration to the MPRAGE, nonlinear warping to MNI space, spatial smoothing with a 6 mm FWHM kernel, nuisance regression based on head motion (translation/rotation and their first derivatives) and non-gray matter signal, and high-pass filtering (100 s). To account for magnetic equilibrium, the first four volumes of the functional scan were discarded.
Experimental design and statistical analyses
Behavioral analysis
We first tested whether players showed subjective responses that were congruent with the Dictator's offer during the DG. For each participant, we ran a regression with individual self-reported feelings of the offer as the dependent variable and offer value as the independent variable. To test for significance, we submitted r-to-z-transformed scores to one-sample t tests. Next, we tested whether individuals were more likely to approach Dictators that offered them more or less money during the DG. For each participant, we ran a general linear model (GLM), as implemented by the MATLAB “glmfit” function with participants' choice behavior during the decision task as the dependent variable, and offer amounts as the independent variable. To investigate the influence of different types of memory on choice behavior during the decision task, we ran an ANOVA where the dependent variable was choice, and within-subject predictors were value outcome and memory (Face and Offer memory). Outcome was split into binary categories of high/fair offers ($3.60–$5.00) and low/unfair offers ($0.10–$1.50). We note that in social situations low values are often yoked to unfair offers (e.g., $0.10 of $10), and high values to fair offers, such that it is difficult to dissociate high reward from fair or equitable outcomes. Memory was split into the following three categories: no memory, face memory, face + offer memory. Evidence of a significant ANOVA effect was followed by post hoc t tests to specify the nature of the interaction. Trials in which participants had the opportunity of selecting the novel face stimuli were not included in these behavioral analyses.
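As a rough illustration of the first step above, the sketch below computes a per-participant correlation between offers and ratings, applies the Fisher r-to-z transform, and runs a one-sample t test against zero. The data are invented toy values, the helper names are hypothetical, and the use of a Pearson correlation as the per-participant statistic is an assumption; the original analysis was run in MATLAB.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher r-to-z transform (inverse hyperbolic tangent); undefined at |r| = 1."""
    return math.atanh(r)


def one_sample_t(values, popmean=0.0):
    """Return (t statistic, degrees of freedom) for a one-sample t test."""
    n = len(values)
    t = (mean(values) - popmean) / (stdev(values) / math.sqrt(n))
    return t, n - 1


# Toy data (assumed): offers in dollars, ratings on the 3-point scale (1 = good, 3 = bad)
offers = [0.5, 1.0, 4.0, 5.0]
ratings_by_participant = [
    [3, 3, 1, 1],
    [3, 2, 2, 1],
    [2, 3, 1, 1],
]

z_scores = [fisher_z(pearson_r(offers, r)) for r in ratings_by_participant]
t_stat, df = one_sample_t(z_scores)
```

Because the rating scale runs from 1 (good) to 3 (bad), ratings that track the offers produce negative correlations and therefore a negative t statistic in this toy example; a p value would then be read from a t distribution with `df` degrees of freedom.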
fMRI first-level and group analysis
Imaging analysis focused on the data from the decision task. Data were modeled using the following three regressors of interest: adaptive choice, maladaptive choice, and novel choice. The adaptive choice regressor modeled trials in which participants decided either to reengage with players who made fair offers or to avoid engaging with players who made unfair offers in the DG. The maladaptive choice regressor modeled trials in which participants decided either to reengage players who made unfair offers or to avoid engaging with players who made fair offers in the DG. The novel choice regressor modeled all trials in which participants made choices about novel players either by selecting to play or to avoid them.
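The mapping from trials to the three regressors of interest can be summarized in a short sketch. The function and argument names are hypothetical and are used only to make the definitions above concrete.

```python
def classify_trial(prior_offer_type, chose_partner):
    """Assign a decision-task trial to a regressor of interest.

    prior_offer_type: "fair", "unfair", or "novel" (hypothetical labels).
    chose_partner: True if the participant chose to play with the pictured
        person, False if they chose the schematic gray face instead.
    """
    if prior_offer_type == "novel":
        # All choices about never-before-seen players share one regressor.
        return "novel"
    if prior_offer_type not in ("fair", "unfair"):
        raise ValueError(f"unknown offer type: {prior_offer_type!r}")
    reengage_fair = prior_offer_type == "fair" and chose_partner
    avoid_unfair = prior_offer_type == "unfair" and not chose_partner
    return "adaptive" if (reengage_fair or avoid_unfair) else "maladaptive"


labels = [
    classify_trial("fair", True),     # reengage a fair Dictator
    classify_trial("unfair", False),  # avoid an unfair Dictator
    classify_trial("unfair", True),   # reengage an unfair Dictator
    classify_trial("fair", False),    # avoid a fair Dictator
    classify_trial("novel", True),    # any choice about a novel player
]
```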
Given prior research showing that the hemodynamic response function (HRF) in the hippocampus does not always follow a canonical shape during memory retrieval, we opted to estimate voxel-specific responses for each condition. This was done with the 3dDeconvolve function in AFNI, modeling each regressor over a 20 TR time period using a 10-parameter sine series expansion. In addition to our regressors of interest described above, each individual's first-level model also included a seventh-order Legendre polynomial basis set to account for low-frequency drifts in the data. Preliminary analyses using a traditional temporal window of 13 TRs revealed that responses in the hippocampus failed to reach baseline at 26 s, despite responses in other regions (e.g., the visual cortex) reaching baseline in the same time frame. Thus, to fully characterize the hemodynamic response in the hippocampus and provide a more complete and accurate representation of our data, we used an extended time period of 20 TRs.
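To give a sense of what a sine series expansion over the 20 TR window looks like, the following sketch builds ten sine basis functions sampled at each TR (TR = 2 s, so 20 TRs span 40 s) and reconstructs a response from a set of weights. The functional form sin(kπt/T), the sampling grid, and all names are assumptions made for illustration; the exact basis and scaling used by AFNI may differ.

```python
import math

TR = 2.0       # repetition time in seconds (from the acquisition parameters)
N_TRS = 20     # length of the modeled window in TRs
N_BASIS = 10   # number of sine basis functions


def sine_basis(n_trs=N_TRS, tr=TR, n_basis=N_BASIS):
    """Return an n_trs x n_basis matrix of sine basis values.

    Assumed form: basis k at time t is sin(k * pi * t / T), with T the
    window duration, so every basis function is zero at window onset.
    """
    duration = n_trs * tr
    times = [i * tr for i in range(n_trs)]
    return [
        [math.sin(k * math.pi * t / duration) for k in range(1, n_basis + 1)]
        for t in times
    ]


def reconstruct(betas, basis):
    """Estimated response at each TR as a weighted sum of basis functions."""
    return [sum(b * f for b, f in zip(betas, row)) for row in basis]


basis = sine_basis()
# Hypothetical weights: only the slowest sine component is active.
betas = [1.0] + [0.0] * (N_BASIS - 1)
response = reconstruct(betas, basis)
peak_tr = max(range(N_TRS), key=lambda i: response[i])
```

With only the first component weighted, the reconstructed response peaks at the midpoint of the window (TR 10, i.e., 20 s), which shows how a flexible basis can capture responses that evolve far more slowly than a canonical HRF.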
We additionally performed a separate GLM to look at whether responses during the decision task represent individual performance during a later memory test. We implemented three regressors of interest representing (1) trials in which participants subsequently had memory for the Dictator and the offer made; (2) trials in which participants either had memory only for the Dictator but not the offer, or, no memory at all; and (3) trials in which decisions were made about Novel players. The same modeling procedures and inclusion of nuisance regressors were used as detailed in the GLM described above. We should note that for this analysis we were somewhat underpowered, as the mean number of trials in which participants had memory for the Dictator and their offer was 7.7 with a range of 1–19 trials.
Group-level analyses were conducted using a multilevel model implemented in the AFNI 3dMVM with each individual's voxel-specific HRF as an input, which tested for interactions between condition (i.e., adaptive, maladaptive) and time (i.e., each TR). We used 3dClustSim to identify significant clusters with the option to simulate noise using the spatial autocorrelation function given by a mixed-model run on noise estimates on first-level data. Height and extent thresholds were set at a height level of p < 0.001 and a corrected α level of p = 0.01 (two tailed; using third-nearest neighbor clustering). We first estimated significance within a regions-of-interest mask, which included bilateral hippocampus (defined in the Automated Anatomical Labeling Atlas), as well as the regions within the striatum known to participate in affective and cognitive processes (defined by the Oxford-GSK-Imanova structural striatal atlas). This yielded a minimum cluster size of seven voxels; thus, any clusters consisting of seven or more voxels within our regions of interest (ROIs) were deemed significant. Notably, the definitions of the striatum include the entire ventral striatum and anterior and middle portions of the caudate. Additionally, we ran a whole-brain analysis that yielded a minimum cluster size of 21 voxels.
Investigating differences in brain activation using a TR × condition interaction with a multilevel model cannot specify the direction of the effect. To characterize the direction of this interaction, post hoc analyses were run to unpack the nature of the clusters showing significant interactions at or above threshold within our region of interest. First, we plotted the entire estimated hemodynamic response function for the adaptive and maladaptive regressors, and identified time points where there were significant differences by running a t test on each individual TR. These post hoc tests were corrected for multiple comparisons using a false discovery rate reported at q < 0.1.
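The per-TR correction can be sketched as follows. The text specifies a false discovery rate threshold but not the procedure, so the Benjamini-Hochberg step-up method used here is an assumption, and the p values are invented for illustration.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return a list of booleans marking tests that survive FDR at level q.

    Benjamini-Hochberg step-up procedure (assumed; the source does not name
    the FDR method): find the largest rank k with p_(k) <= q * k / m and
    declare the k smallest p values significant.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff_rank = rank
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


# Hypothetical per-TR p values from adaptive vs maladaptive t tests
p_per_tr = [0.001, 0.04, 0.03, 0.2, 0.5]
survives = benjamini_hochberg(p_per_tr, q=0.1)
```

In this toy example the three smallest p values fall below their rank-scaled thresholds (0.02, 0.04, 0.06), so the first three TRs survive correction.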
To further unpack the behavioral relevance of these differences while controlling for multiple comparisons, we isolated TRs that revealed peak differences between adaptive and maladaptive trials in both the positive (adaptive > maladaptive) and negative (maladaptive > adaptive) directions. We then independently compared activation at these TRs against the novel hemodynamic response to gain better traction of the hippocampal signal. Critically, novel stimuli were not included in the original analysis when identifying the significant clusters and could thus serve as independent comparison stimuli to decipher the nature of the interactions (i.e., these t tests are statistically independent from prior analyses). Finally, we computed a neural difference score of adaptive versus maladaptive from these two time points in an across-subject analysis to measure the effect on adaptive choice behavior (i.e., an independent statistical analysis).
Results
Behavioral findings
Confirming that participants were sensitive to the offers made by Dictators, a linear regression revealed that subjects reported feeling more positive about fair versus unfair offers from Dictators in the DG [β = 0.83 (0.01); t = 24.63; p < 0.001]. During the decision phase, there were no significant differences in reaction time (RT) when individuals were making decisions in response to a fair Dictator [mean (SE) = 1.71 (0.12)], unfair Dictator [mean (SE) = 1.70 (1.11)], or novel Dictator [mean (SE) = 1.73 (0.11); p values > 0.40]. A linear regression revealed, however, that participants were more likely to reengage with Dictators that gave them fair versus unfair offers during the previous DG [β = 0.24 (0.08); t = 3.17; p = 0.005], indicating that, on the whole, subjects were making decisions that were adaptive and likely to benefit them in the future. Participants also made these adaptive decisions more slowly [i.e., selecting fair Dictators, avoiding unfair Dictators, 1.85 (0.10)] than maladaptive decisions [i.e., selecting unfair Dictators, avoiding fair Dictators, 1.78 (0.10); t(19) = 3.84, p < 0.001], which dovetails with recent work revealing that the hippocampus is involved in deliberating over valued options (Bakkour et al., 2019). Table 1 provides descriptive statistics of our item memory test. While there was significant item memory for faces encountered during the original Dictator Game (p < 0.001), there were no significant differences in item memory across fair and unfair Dictators [fair: mean (SE) = 0.63 (0.04); unfair: mean (SE) = 0.62 (0.04); t(19) = 0.78, p = 0.44]. For associative memory, there was evidence of significantly greater associative memory for unfair versus fair Dictators [fair: mean (SE) = 0.10 (0.02); unfair: mean (SE) = 0.33 (0.03); t(19) = 5.43, p < 0.001].
An ANOVA testing for interactions between memory and choice revealed that adaptive choices were dependent on an individual's memory of their prior experience with each Dictator (p < 0.001; Fig. 1B, Table 2). Post hoc t tests revealed that subjects did not show any differences in their tendency to approach fair and unfair Dictators when they did not have memory for the Dictator (no memory; t(19) = −0.14; p = 0.99) or when they only had memory for the Dictator but not how much the Dictator offered (face memory; t(19) = 0.58; p = 0.57). However, when individuals had intact memory for the Dictator and how much they previously offered, they decided to reengage with fair players far more often than unfair players (face + offer memory; t(19) = 4.05; p = 0.001). This finding was driven by exhibiting stronger associative memories for unfair (lower) offers compared with fair (higher) offers (t(19) = −5.13, p < 0.001).
Neuroimaging results
We first identified regions showing significant differences when individuals made adaptive versus maladaptive choices when encountering dictators. Significant differences were found in the right hippocampus (p < 0.01, small-volume corrected; MNI space: [x, y, z] = [33, −30, −9], k = 16; Fig. 2A) as well as a network of regions including the middle frontal gyrus, insula, and fusiform gyrus (p < 0.01, whole-brain corrected). Full time courses for regions showing significant differences outside of the hippocampus are depicted in Figure 3. Critically, we observed no significant activations within our striatal ROI using the same time course analysis that identified the hippocampal cluster—even when using a very liberal threshold of p < 0.01 uncorrected. Similarly, no clusters were identified using a canonical HRF (i.e., a double-gamma HRF) at a liberal threshold of p < 0.01.
Post hoc analyses of the right hippocampus cluster revealed a complex time course with three discrete phases, each comprising six TRs (time course series broken into three phases of equal TR length; Fig. 2B). In the early phase (TR 0–5), hippocampal activation did not differ across conditions. During the middle phase (TR 6–11), hippocampal activation for adaptive choice was suppressed compared with maladaptive choices (i.e., adaptive suppression). During the late phase (TR 12–17), hippocampal activation for adaptive choice was enhanced compared with maladaptive choices (i.e., hippocampal enhancement during adaptive choice). These findings suggest that there are two putative neural signals that support adaptive choice: a hippocampal suppression (maladaptive > adaptive) followed by a hippocampal enhancement (adaptive > maladaptive). Notably, the suppression signals were unique to the hippocampus and were not apparent in any regions identified in the comparison of adaptive to maladaptive trials (Fig. 3). In addition, post hoc analysis did not reveal any differences in the hippocampus as a function of the condition (fair, unfair) on the concurrent or previous trial, suggesting that our late signals were not a function of the content of the subsequent trial.
To gain more traction on the nature of these adaptive suppression and enhancement signals evoked in the hippocampus, we conducted additional post hoc analyses on TRs showing peak hippocampal suppression for adaptive choices (i.e., TR = 10, maladaptive > adaptive; Fig. 2B) and peak hippocampal enhancement for adaptive choices (i.e., TR = 12, adaptive > maladaptive; Fig. 2B). We first tested whether these adaptive suppression and enhancement signals predicted individual differences in adaptive decision-making. Adaptive choice was defined as the β-value in a regression between participants' propensity to approach players depending on how fair or unfair their offers were during the Dictator Game. We found that the attenuated hippocampal BOLD response during the middle suppression phase correlated with a greater likelihood of making adaptive choices (TR = 10; r(19) = −0.51, p = 0.02; Fig. 4A, left). There was no significant relationship between the later hippocampal enhancements and adaptive choice (TR = 12, r(19) = −0.19, p = 0.61; Fig. 4A, right). However, the direct comparison between suppression and enhancement phases was not significant (p > 0.2). A similar coupling between hippocampal responses and adaptive behavior was observed at other time points as well, revealing a significant enhancement and suppression signal in the hippocampus (Table 3).
To test whether the adaptive suppression and enhancement signals showed properties reflecting more general memory retrieval, we compared these responses to when participants responded to novel players they had never seen before (i.e., novel choice), which allowed us to uniquely identify signals specifically linked to memory (previously encountered players) versus encoding for future adaptive choice (novel players). During the adaptive enhancement phase, there was a significant increase in hippocampal activation during adaptive choice compared with novel choice (TR = 12; estimated time series of the HRF: t(19) = 3.71, p = 0.002; Fig. 5A), and no differences comparing maladaptive choice and novel choice (t(19) = −1.14, p = 0.27), suggesting that memory-like responses only emerged when individuals made adaptive choices. In contrast, during the adaptive suppression phase, there were no significant systematic differences in hippocampal activation during either adaptive or maladaptive choice compared with novel choice (TR = 10, p values > 0.15). A similar trending pattern between hippocampal responses to adaptive versus maladaptive behavior was also observed at other time points, revealing a significant suppression in the hippocampus, while all TRs showing enhancements were unrelated to adaptive behavior (Table 3).
While these findings suggest that memory-related processes are important when enacting a choice that benefits oneself, documenting an early hippocampal signal would provide converging evidence that the relationship between the hippocampus and adaptive choice is robust. Accordingly, we explored hippocampal signals during choice when individuals had memory for Dictators and their offers versus trials in which a Dictator might be remembered but their offer was not, or when there was no memory for the Dictator at all. This analysis of subsequent memory during the choice period revealed a significant cluster in the right hippocampus (p < 0.01, small-volume corrected; [x, y, z] = [36, −18, −15], k = 21; Fig. 6A), the left middle frontal gyrus (p < 0.01, whole-brain corrected; [x, y, z] = [−47, 19, 37], k = 21; Fig. 6B) and right middle occipital gyrus (p < 0.01, whole-brain corrected; [x, y, z] = [25, −97, 10], k = 236 1; Fig. 6B). Within the hippocampal cluster, peak differences occurred at TR = 5, revealing greater activation when individuals had intact memory for Dictators and their offers compared with memory for the Dictator alone or no memory at all. We should note, however, that this analysis should be interpreted with caution, as there were relatively few trials in which participants had memory for the Dictator and their offer [mean number of trials (range) = 7.7 (1–19)].
Discussion
Based on recent work showing that episodic memory supports adaptive choice during single-shot learning (Murty et al., 2016), we tested the hypothesis that the hippocampus plays a critical role in guiding choice when decisions are based on limited previous social exposure. We observed that adaptive choices, selecting partners who treated you well in the past and avoiding those who treated you poorly, rely on a trace signal in the hippocampus evocative of repetition suppression seen during episodic memory (Köhler et al., 2005; Kumaran and Maguire, 2007; Chen et al., 2011; Howard et al., 2011). Since there was no evidence of striatal involvement during either adaptive or maladaptive choice, this provides evidence that hippocampal, rather than striatal, signals are associated with socially adaptive value-based learning.
Our results indicate that while early hippocampal responses (TRs 0–5) do not discriminate between adaptive and maladaptive choices, they do index subsequent memory. In contrast, middle (TRs 6–11) and later (TRs 12–17) hippocampal responses are sensitive to adaptive versus maladaptive choices. Specifically, we observed a suppression signal across subjects during the middle phase of the hippocampal time series response, which was associated with an individual's capacity to make socially adaptive choices during single-shot learning. In other words, deciding to reengage with someone who treated you well and avoid someone who treated you poorly was linked to the degree to which the hippocampus was suppressed. Prior research illustrates that repetition suppression in the hippocampus scales with memory strength (Gonsalves et al., 2005), which may be especially sensitive to memories for associations between discrete elements of an episode (Köhler et al., 2005; Howard et al., 2011)—such as players who made generous or selfish offers in our paradigm. Notably, the hippocampus did not distinguish between adaptive and novel trials during TR = 10, which challenges our interpretation that this suppression response reflects associative memory retrieval. However, our task structure cannot tease apart whether subjects are using retrieval strategies (i.e., recall to reject, generalization) or are newly encoding novel faces.
Accordingly, our finding that adaptive choices first show a repetition suppression signal suggests that hippocampal sensitivity for the subjective perception of a person and how well they treat you may also be invoked during the choice itself (Desimone, 1996). The adaptive decision to play with good people and avoid bad people seems to be supported by the hippocampus indexing the relationship between the previous person encountered and the outcome of that particular exchange, which parallels prior work showing that intact episodic memory is needed to make these adaptive choices (Murty et al., 2016). In line with this, we also found that the right hippocampus was more active during decision-making trials when there was intact memory for the Dictators and their offers. Thus, when deciding, it is likely that the hippocampus exhibits both a signal supporting the current adaptive choice, as well as a detailed episodic memory of the original social exchange. However, it is impossible to explicitly probe episodic memory during decision-making, which leaves open the possibility that the hippocampus is not only representing consciously accessible memories, but implicit memories as well. If this were the case, the ability of the hippocampus to distinguish between individuals who should be approached versus avoided may be due in part to the absence of any conscious memory, which may help explain the fact that subjects reported intact episodic memory for a fraction of the dictators, and yet still managed to behave in an adaptive manner.
Together, these findings add to a literature illustrating that the hippocampus plays a larger role than just encoding episodic memory per se (Shohamy and Turk-Browne, 2013; Gerraty et al., 2014; Davidow et al., 2016). Prior work has elegantly demonstrated that by implicitly spreading value to never before experienced choice options (Wimmer and Shohamy, 2012), and by reactivating prior feedback-based learning experiences (Bornstein et al., 2017; Bornstein and Norman, 2017; Bakkour et al., 2019), the hippocampus interacts with the striatum to encode value. Here, we extend these findings by revealing that the ability to make socially adaptive choices with limited prior experience also relies on the hippocampus rather than the striatum. We interpret our hippocampal findings at TR = 10 to reflect processes directly related to decision-making, as this signal was related to adaptive behavior both within and across participants and did not directly relate to subsequent source memory. However, given the inability to assess causality in neuroimaging data and the late emergence of this signal, we cannot rule out the possibility that this signal reflects postencoding processes that we did not capture in our behavioral measures.
After this initial suppression of the hippocampus, we further observed a late enhancement signal within the hippocampus, a signal exhibited well after the decision was executed (TR > 11). In this stage of the time series, the responses to adaptive decisions were not associated with individual differences in decision-making across subjects, suggesting that this signal did not directly contribute to choice. However, we did find that this hippocampal enhancement signal, unlike the suppression signal, differentiated between subjects making adaptive choices for a previously encountered person versus making choices about a never before seen stranger, signifying the existence of a discrete memory-related signal. Together, our data suggest that the hippocampus is likely involved in multiple aspects of the memory and decision-making process. This is best evidenced by the observation that at TR = 5 the hippocampus predicts subsequent retrieval of source memory (which could theoretically reflect reconsolidation), but at TR = 10 there is no observed effect directly related to memory (i.e., no differentiation between old and new faces and no relationship to subsequent memory).
Although speculative, it is possible that a late-onset enhancement signal may not directly relate to the current choice, but may instead represent a postchoice strengthening of memory traces for future choices. This would allow the hippocampus to play a critical role in actively reinforcing the memory of the person (and whether that person was associated with good or bad outcomes) so that subsequent decisions made in similar contexts are easier to deploy. This would fit with research illustrating that enhanced activity in the hippocampus occurs when individuals successfully encode, integrate, or update associative memories (Spaniol et al., 2009; Bridge and Voss, 2014). Moreover, prior evidence demonstrates that the simple act of choosing strengthens the associative memories relating to the choice (Murty et al., 2015, 2019) and can even enhance the value of the selected option when the choice is inconsequential (Sharot et al., 2010)—which would indicate that the hippocampus plays a dynamic role during social learning. Future work can help to elucidate how current adaptive choices and their associated memories influence subsequent choice, and to identify whether the hippocampus is indexing an increase in value for the selected partner or a devaluing of the unselected partner (or perhaps a combination of both).
Together, our findings reveal that hippocampal responses exhibit a suppression signal that both differentiated between adaptive and maladaptive decisions on a trial-by-trial basis and was associated with the propensity to implement adaptive behavior across participants. If we consider these findings alongside theoretical work implicating the hippocampus in episodic simulation (Schacter et al., 2008, 2017; Gaesser et al., 2013) and model-based choice (Chersi and Pezzulo, 2012; Doll et al., 2012), it is possible that retrieving a trace memory of past experiences is akin to processes that also evoke cognitive maps of the decision space. For example, episodic simulation enables individuals to use past events to construct plausible future events (e.g., I probably will meet this person again), which in turn can help a person decide which option is best (e.g., I should trust him next time).
Within the framework of model-based decision-making, it has also been proposed that the hippocampus generates representations of the contingencies of a task (cognitive maps that include rich information about previous experiences), which can then be used to make adaptive choices (Doll et al., 2015). Dovetailing with this, recent work illustrates that lesioning the hippocampus leads to a decrease in model-based choices (Vikbladh et al., 2019). Although model-based learning is mostly probed using trial-by-trial learning paradigms, the reliance on a rich cognitive map of the decision space need not be unique to multishot learning and may actually be more prominent when decisions are informed by limited prior experience. Indeed, our finding that the hippocampus supports episodic memory retrieval and value-based choice hints that single-shot learning likely also leverages the retrieval of episodic memories to bolster a rich cognitive map of the future decision space, an interpretation that would be consistent with the view that computations in the hippocampus support multiple types of learning and decision-making (Shohamy and Turk-Browne, 2013; Doll et al., 2015). Future work can help bridge the current findings with the broader literature on both statistical and single-shot learning to explicitly probe the role of the hippocampus during model-based choice.
Footnotes
The research was funded by internal grants from the New York University Center for Brain Imaging. This work was also funded in part by a Brain & Behavior Research Foundation NARSAD Young Investigator Award and National Institutes of Health (NIH) Grant P20-GM-103645 to O.F.; and a NARSAD Young Investigator Award, and NIH Grants K01-MH-111991 and R21-DA-043568 to V.P.M.
The authors declare no competing financial interests.
Correspondence should be addressed to Oriel FeldmanHall at oriel.feldmanhall@brown.edu or Vishnu P. Murty at vishnu.murty@temple.edu.