Abstract
Humans need social closeness to prosper. There is evidence that empathy can induce social closeness. However, it remains unclear how empathy-related social closeness is formed and how stable it is as time passes. We applied an acquisition–extinction paradigm combined with computational modeling and fMRI to investigate the formation and stability of empathy-related social closeness. Female participants observed painful stimulation of another person with high probability (acquisition) and low probability (extinction) and rated their closeness to that person. The results of two independent studies showed increased social closeness in the acquisition block that resisted extinction in the extinction block. Providing insights into underlying mechanisms, reinforcement learning modeling revealed that the formation of social closeness is based on a learning signal (prediction error) generated from observing another’s pain, whereas maintaining social closeness is based on a learning signal generated from observing another’s pain relief. The results of a reciprocity control study indicate that this feedback recalibration is specific to learning of empathy-related social closeness. On the neural level, the recalibration of the feedback signal was associated with neural responses in anterior insula and adjacent inferior frontal gyrus and the bilateral superior temporal sulcus/temporoparietal junction. Together, these findings show that empathy-related social closeness generated in bad times, that is, empathy with the misfortune of another person, transfers to good times and thus may form one important basis for stable social relationships.
Significance Statement
Humans feel close to others if they empathize with them. Here we test whether this feeling of social closeness remains if empathy is no longer elicited. Combining mathematical learning models and functional magnetic resonance imaging, we find that empathy with others’ pain establishes stable social closeness that is maintained even if the other person is feeling well again. Explaining the mechanism, we show that the stability of empathy-induced social closeness is based on the recalibration of an empathy-related learning signal in the anterior insula/inferior frontal gyrus and the temporoparietal junction. These findings reveal how empathy maintains social closeness and thus contributes to the formation of stable social relationships.
Introduction
Feeling close to other people is a principal human need (Hill, 2009; Baumeister and Leary, 2017). Documenting its importance, social closeness is linked to an increase in happiness, well-being (Kok and Fredrickson, 2014), and mental health (Cowan et al., 2021; Dempsey et al., 2021) and influences hormone levels associated with altruistic motivation (Brown et al., 2009). Moreover, feeling socially close enhances the willingness to behave prosocially toward others (de Waal et al., 2008; Spaans et al., 2018; Passarelli and Buchanan, 2020).
There is evidence that social closeness and connectedness depend on a shared understanding of the current situation and the others’ internal states (Baek and Parkinson, 2022). The more individuals learn about each other, the closer they feel (Weaver and Bosson, 2011; Sprecher et al., 2013). Sharing of the others’ internal states is one of the main characteristics of empathy (Walter, 2012; Kanske et al., 2015; Stietz et al., 2019). In line with this reasoning, studies have found that increased social closeness is linked to increased empathy (Beeney et al., 2011; Morelli et al., 2015). Empathy enables us to share another’s emotions and thereby provides an important way to connect with other people. As such, empathy has been characterized as the glue that holds relationships and societies together (Witenberg and Thomae, 2016; Calloway-Thomas et al., 2017). One experience that has been shown to incite empathy is observing another person in pain or misfortune (Xu et al., 2009; Beeney et al., 2011; Marsh, 2018). Based on influential models from social psychology (Davis, 1983; Hein et al., 2021), observing others in pain results in sharing the other’s emotions (called affective empathy) and thoughts or intentions (called cognitive empathy or theory of mind). Moreover, it has been shown that the temporal evolution of empathic responses, that is, repeatedly observing another’s pain (Hein, Engelmann, et al., 2016), and choice behavior linked to empathy, that is, vicarious reward choices modulated by trait empathy (Lockwood et al., 2016), can be shaped by learning.
Furthermore, mechanisms underlying social processes similar to forming social closeness toward another person such as impression formation (Frolichs et al., 2022), deciding how to engage with others on social media (Lindström et al., 2021), or building a trusting relationship with a stranger (Chang et al., 2010; Fareri et al., 2012; Fareri, 2019) have previously been successfully specified leveraging reinforcement learning models built on the feedback of others’ reactions or experiences.
Uncovering associated neural regions, previous studies have associated affective empathy with neural responses in the anterior cingulate cortex and the anterior insula (aIns), extending to the adjacent inferior frontal gyrus (IFG; Fan et al., 2011; Walter, 2012; Dvash and Shamay-Tsoory, 2014; Preckel et al., 2018; Cutler and Campbell-Meiklejohn, 2019; Stietz et al., 2019; Schurz et al., 2021). Supporting the close link between empathy and social closeness, neural activation in aIns and IFG (Beeney et al., 2011) as well as anterior cingulate cortex (Müller-Pinzler et al., 2015) were also found to change with varying degrees of social closeness. Cognitive empathy was mainly related to neural activation of the medial prefrontal cortex (mPFC), the superior temporal sulcus (STS), the temporal poles (TP), and the temporoparietal junction (TPJ; Dvash and Shamay-Tsoory, 2014; Preckel et al., 2018; Cutler and Campbell-Meiklejohn, 2019; Stietz et al., 2019; Schurz et al., 2021).
Taken together, these previous studies showed that observing the suffering of another person (e.g., pain) activates neural circuits that have been associated with affective and cognitive empathy.
Furthermore, there is evidence that these networks can be linked to empathy-related learning processes, such as overcoming ingroup biases (Hein, Engelmann, et al., 2016), learning to obtain rewards for another (Lockwood et al., 2016), or learning to avoid harming others (Lengersdorff et al., 2020). Antisocial traits that are negatively linked with empathy also modulate the process of learning to make choices that benefit others (Cutler et al., 2020; O’Connell et al., 2022) as well as the extent to which other-regarding information is considered during a social decision process (Rhoads et al., 2023).
Empathy or behavioral and neural empathic reactions in turn have been shown to be dependent on social closeness (Krienen et al., 2010; Beeney et al., 2011; Morelli et al., 2015). However, it remains unclear whether social closeness can be learned based on empathy (i.e., sharing the other’s internal state) and to what degree empathy-related closeness persists once empathy is no longer activated. In other words, does empathy-related social closeness prevail once the other person is feeling better and thus may no longer attract empathy?
Here, we used an adapted acquisition–extinction paradigm (Palminteri et al., 2015; Shiban et al., 2015; Dunsmoor et al., 2018), reinforcement learning modeling, and functional magnetic resonance imaging (fMRI) to address these questions in two independent studies. In an additional study, we tested whether the observed behavioral and computational results are specific for empathy-related closeness or reflect general learning-related changes in social closeness that also occur in other social contexts (reciprocity control study). Participants inside the fMRI scanner (Study 1) and in the laboratory (Study 2) observed painful stimulation of another person known to elicit empathy for pain (Lamm et al., 2007; Beeney et al., 2011; Hein, Engelmann, et al., 2016; Marsh, 2018; Grynberg and Konrath, 2020) in two conditions: a treatment condition and a control condition. In the first block of the treatment condition (the acquisition block), participants observed painful stimulation of the other person with high probability (80%). In a second block (the extinction block), they observed the other receive painful stimulation with low probability (20%). In the control condition, participants observed painful stimulation in another person at chance level in both blocks (50%; Fig. 1A). In each trial, after observing the stimulation of the other person, participants rated their emotional reaction to the stimulation and subsequently indicated how close they felt to the other. To do so, they moved a mannequin (representing themselves) toward or away from a mannequin representing the other person (Fig. 1B).
Figure 1. Visualization of the design and trial structure. A, Participants sequentially underwent two counterbalanced conditions. In the treatment condition, they interacted with a first partner and performed two blocks of the motive task. In Block 1, empathy was reinforced in 80% of the trials and in Block 2 in 20% of the trials. They performed the same tasks again with a new interaction partner in the control condition. Here, empathy was reinforced in 50% of the trials in both blocks. The order of treatment and control conditions was counterbalanced across participants. B, Exemplary trial of the fMRI study and the behavioral replication study. At the beginning of each trial, participants observed that the other person received a painful stimulation (high pain trial = reinforced trial) or a nonpainful stimulation (no-pain trial = nonreinforced trial). Then, following a fixation cross (1,000–2,000 ms), participants rated how they felt after observing this feedback. After a second 4,000–6,000 ms fixation period, participants (green mannequin) indicated how close they felt to the other person. C, Exemplary trial of the reciprocity control study. At the beginning of each trial, participants saw the ostensible decision screen of the other person and were shown whether the other person had decided to help them (help trial = reinforced trial) or not to help them (no help trial = nonreinforced trial). After a fixation cross (1,000–2,000 ms), participants rated how they felt after observing this feedback. Then followed a second fixation cross (4,000–6,000 ms) and lastly, participants indicated how close they felt to the other person.
This setup allowed us to investigate the formation of empathy-related closeness in the acquisition block and the stability of empathy-related closeness in the extinction block. To formalize the dynamic changes in social closeness with changing activation of empathy (i.e., after observing pain or nonpain in the other), we used a set of reinforcement learning models. Reinforcement learning models mathematically describe the process of learning specific stimulus–outcome (i.e., reward vs punishment) associations via trial and error (Rescorla and Wagner, 1972), which can be extended to associations between persons and outcomes. The Rescorla–Wagner model assumes that learning is driven by prediction errors, reflecting the difference between an observed and an expected feedback or outcome.
Inspired by previous work demonstrating that watching others receive painful stimulation elicits empathy (Lamm et al., 2007; Beeney et al., 2011; Hein, Engelmann, et al., 2016; Marsh, 2018; Grynberg and Konrath, 2020) and that empathy is linked to social closeness (Morelli et al., 2015), we hypothesized that watching another person receiving pain elicits a learning signal that is used to update social closeness. In more detail, we assume that empathy elicited by observing another person in pain increases social closeness that deviates from the prior expectation of felt social closeness. This empathy-related deviation of expected social closeness may generate a prediction error that in turn results in a dynamic update of social closeness depending on the “empathy reinforcer” (i.e., observing another’s pain). In the acquisition block with frequent empathy reinforcers, participants should show a learning-related increase in social closeness, captured by a dynamic increase in closeness ratings that reflects the respective prediction error estimate from our learning model. In the extinction block, when the other person only occasionally received empathy-inducing painful stimulation, we hypothesized a learning-related decay of empathy-related social closeness. However, if empathy-related closeness resists extinction, it should not decay when empathy reinforcers become rare in the extinction block; that is, we should find no significant differences in closeness ratings between the acquisition and the extinction block.
On a neural level, based on the previous literature (Lamm et al., 2007; Beeney et al., 2011; Morelli et al., 2015; Hein, Engelmann, et al., 2016; Marsh, 2018; Grynberg and Konrath, 2020), it is plausible to assume that changes in empathy-related social closeness should be associated with changes in activation in a network of brain regions that have been related to empathic responses, including the aIns and the adjacent IFG as well as the anterior and mid cingulate cortex, the TPJ, the STS, the mPFC, and the temporal poles. If empathy-related social closeness decays when the other person is suffering less, we should observe a decrease in neural activation in this neural network. Alternatively, if empathy-related social closeness resists extinction, empathy-related neural responses should be maintained in the extinction block with no significant difference to the acquisition block.
Materials and Methods
Participants
We recruited 107 right-handed healthy female participants via online platforms and flyers posted around the university campus in Würzburg (convenience sample, see Table 1 for mean age and spread). Participants were assigned to three different studies: two studies investigating the formation and stability of empathy-related social closeness (one fMRI and one behavioral replication study) and one behavioral reciprocity control study. We trained two female students who served as confederates in all three studies.
Table 1. Demographics and average questionnaire scores of participants in the three studies
Table 1-1. Correlation between trait empathy and individual learning rates. Results of the Spearman correlations testing the relationship between individual learning rates of the winning model for the fMRI and the behavioral replication study (αempathy(model 3)) and trait empathy, as well as between individual learning rates of the two possible models for the reciprocity control study (αreciprocity(model 1)/αreciprocity(model 3)) and trait reciprocity. EC = empathic concern, PT = perspective-taking, PR = positive reciprocity.
We chose female participants as well as female confederates to control for gender and avoid cross-gender effects (Han et al., 2008; Christov-Moore et al., 2014; Bluhm, 2017). The confederates were students who had been trained to act as naive participants. We ensured that participants did not know either of the confederates prior to the experiment by asking confederates beforehand. Before the experiment began, written informed consent was obtained from all the participants. The study was approved by the local ethics committee (268/18). Participants received monetary compensation [26.80 ± 3.30 Euros (mean ± SD)]. Monetary compensation was based on a fixed show-up fee and an individual pay-out based on the behavior in a decision task, which participants performed in addition.
We had to exclude seven datasets (five from the fMRI study and one each from the behavioral replication and the reciprocity control study), because the estimation of learning models was not possible due to a lack of variance in ratings (four participants), falling asleep (two participants), or technical problems (one participant). Thus, we analyzed 46 datasets for the fMRI study, 27 datasets for the behavioral replication study, and 27 datasets for the control study. The mean age was comparable between studies (F(2,106) = 0.99; p = 0.376; see Table 1 for an overview of sample characteristics). A post hoc sensitivity analysis using G*Power 3.1 indicated that given α = 5%, and considering three predictors in the regression model, the sample sizes of the respective studies had 80% power to detect a true effect with an effect size of f ≥ 0.18 (F = 2.68) in the fMRI study and an effect size of f ≥ 0.23 (F = 2.73) in the behavioral replication and the control studies.
fMRI study and behavioral replication study
Procedure
Prior to the tasks, the individual thresholds for pain stimulation (see below, Pain stimulation for details) were determined for the participants and the confederates. Thus, participants had a first-hand experience of the pain stimulation they would observe in others.
Next, the participants and confederates were assigned their different roles in a manipulated lottery of drawing matches. The participant always drew the last match to ensure that she was assigned her designated role (observer). The confederates were assigned the role of pain recipients and served as treatment or control partner counterbalanced across participants. In the fMRI study, the respective confederate (treatment partner in the treatment condition and control partner in the control condition) was seated on a chair to the left of the participant with her hand visible to the participant. In the behavioral replication study, the respective confederate was seated next to the participant in a soundproof cabin facing the opposite direction such that no one could see the other’s screen.
The fMRI experiment consisted of two conditions that were presented within-subject: (1) the treatment condition in which the participants observed painful stimulation of one of the confederates (treatment partner) with high probability (acquisition) and low probability (extinction) and (2) the control condition in which participants observed painful stimulation of the other confederate (control partner) with chance probability in both blocks (Fig. 1). In the acquisition block, participants observed that the partner received ostensibly painful stimulation in 80% of the trials, that is, 80% empathy reinforcers. In the extinction block, they observed painful stimulation of the same confederate in 20% of the trials, that is, 20% empathy reinforcers. In the control condition, participants observed painful stimulation of the second confederate in 50% of the trials of both blocks. Each block consisted of 25 trials. The order of treatment and control conditions was counterbalanced across participants. Participants observed painful stimulation of different individuals in the treatment and the control condition to avoid spillover effects and to keep the ostensible pain stimulation of the other person in a reasonable range.
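The reinforcement schedule described above can be illustrated with a minimal Python sketch. It is not the original experiment code: the function and variable names are hypothetical, and it assumes that each block contained a fixed number of reinforced trials in shuffled order (the text does not state whether outcomes were fixed counts or independent random draws, nor how 50% of 25 trials was rounded).

```python
import random

TRIALS_PER_BLOCK = 25  # stated in the text

def make_block(p_reinforced, n_trials=TRIALS_PER_BLOCK, rng=None):
    """Return a shuffled list of outcomes: 1 = pain observed, 0 = no pain.

    Assumption: a fixed count of reinforced trials obtained by rounding
    p_reinforced * n_trials. Python rounds 12.5 to 12, so a 50% block
    contains 12 reinforced trials in this sketch.
    """
    rng = rng or random.Random()
    n_reinforced = round(p_reinforced * n_trials)
    block = [1] * n_reinforced + [0] * (n_trials - n_reinforced)
    rng.shuffle(block)
    return block

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
acquisition = make_block(0.8, rng=rng)  # treatment condition, Block 1
extinction = make_block(0.2, rng=rng)   # treatment condition, Block 2
control = make_block(0.5, rng=rng)      # control condition, either block
```

Under these assumptions, the acquisition block contains 20 reinforced trials and the extinction block 5.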
Participants spent ∼60 min in the scanner and the entire procedure took ∼2.5 h. The behavioral replication study lasted ∼2 h. To avoid possible reputation effects (Gächter and Falk, 2002; Engelmann and Fischbacher, 2009), which could influence participants’ behavior, participants were informed at the beginning that they would not meet the others after the experiment. In more detail, at the end of the fMRI study, the second confederate left and the participant remained in the scanner for anatomical image acquisition. At the end of the behavioral replication study, the confederate left and participants remained in the cabin to complete the same questionnaires as in the fMRI study.
Task
At the beginning of each trial, participants were either shown a fully filled flash in the partner’s color (symbolizing a painful stimulation of the partner, i.e., a reinforced trial) or a partly filled flash in the partner’s color (symbolizing a nonpainful stimulation of the partner, i.e., a nonreinforced trial) for 2,000 ms. The respective flash was followed by a fixation cross (1,000–2,000 ms). Subsequently, participants indicated how they felt after having observed the partner’s stimulation (“How do you feel?” in German) on a visually displayed continuous slider scale with the extreme point anchors “very bad” (internally corresponding to 0) and “very good” (internally corresponding to 100) and had to respond within 10 s (6 s in the laboratory study). The emotion rating was followed by a fixation cross displayed for 4,000–6,000 ms. At the end of each trial, participants saw a continuous slider scale (internally ranging from 0 to 100) that asked the participant to indicate how close they felt to the other person at this moment (“How close do you feel to the other person?” in German). Participants were asked to respond within 10 s (6 s in the laboratory study). The trial structure is visualized in Figure 1B.
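The sequence of events within a trial can be summarized in a short Python sketch. The event labels and the function name are hypothetical, and the jittered fixation durations are assumed to be drawn uniformly from the stated ranges (the text gives only the ranges, not the distribution). Rating durations are upper bounds, because a rating ends when the participant responds.

```python
import random

def sample_trial_timeline(reinforced, response_window_s=10.0, rng=None):
    """Return a list of (event, duration in seconds) pairs for one trial.

    response_window_s is 10 s in the fMRI study and 6 s in the laboratory
    study; the two rating entries give the maximum allowed response time.
    """
    rng = rng or random.Random()
    return [
        ("flash_full" if reinforced else "flash_partial", 2.0),
        ("fixation_1", rng.uniform(1.0, 2.0)),   # assumed uniform jitter
        ("emotion_rating", response_window_s),
        ("fixation_2", rng.uniform(4.0, 6.0)),   # assumed uniform jitter
        ("closeness_rating", response_window_s),
    ]

timeline = sample_trial_timeline(reinforced=True, rng=random.Random(1))
max_trial_duration = sum(duration for _, duration in timeline)
```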
Reciprocity control study
Procedure
The procedure was identical to the behavioral replication study, except that now the participants were assigned as pain recipients and the confederates could decide to give up money to spare them from pain, a procedure that has been shown to activate the motive to repay this favor (Hein et al., 2016a; Saulin et al., 2022). Similarly to empathy, this social norm of reciprocity can increase closeness (Neyer et al., 2011; Adams and Miller, 2022). Inducing reciprocity in the reciprocity control study allowed us to test if potential learning-related changes in social closeness are specifically related to empathy or generalize to other socially ubiquitous contexts.
Each block of the control learning task consisted of 25 trials. In the treatment condition (corresponding to two interaction blocks with one confederate), participants observed that the partner ostensibly decided to help them in 80% of the trials in Block 1 (acquisition) and in 20% of the trials in Block 2 (extinction). In the control condition (corresponding to the two interaction blocks with the other confederate), participants observed that the partner ostensibly helped them in 50% of the trials in Block 1 as well as Block 2. Again, the order of the treatment and control conditions was counterbalanced across participants. To avoid possible reputation effects (Gächter and Falk, 2002; Engelmann and Fischbacher, 2009), which could influence participants’ behavior, at the beginning of the experiment, participants were informed that they would not meet the ostensible other participants after the experiment. At the end of the study, the confederate left and participants remained in the cabin to complete the same questionnaires as in the other two studies.
Task
The trial structure was analogous to the fMRI study and the behavioral replication study described above using the same assessment of trial-by-trial social closeness. Each trial started with the display of a screen, in which the two possible options were visualized side-by-side using a fully filled flash in the color of the participant (symbolizing the option to take the monetary reward and not help the participant) and a crossed out fully filled flash in the color of the participant (symbolizing the option to forego the monetary reward and help). Participants were told that this was the decision screen, which the interaction partner also saw while making her decision to either spare or not spare the participant from painful stimulation. This screen was shown for a jittered length of 2,000–4,000 ms, followed by the display of the ostensible decision of the interaction partner. If the decision was to help (reinforced trial), the crossed-out flash was highlighted by a box in the color of the interaction partner. If the decision was not to help (nonreinforced trial), the fully filled flash was shown highlighted by a box in the color of the interaction partner. After another fixation cross (1,000–2,000 ms), the emotion rating scale was shown asking the participant how they felt after observing the partner’s decision (“How do you feel?” in German). Participants were asked to respond within 6 s. After a jittered fixation cross (4,000–6,000 ms), participants were asked to indicate how close they felt to the other person at that moment (“How close do you feel to the other person?” in German) on a continuous slider scale (internally ranging from 0–100) within 6 s. After a fixation cross (1,000–2,000 ms), the next trial started.
Questionnaires
At the end of the respective main experiments, participants filled out questionnaires capturing trait empathic concern and perspective taking/cognitive empathy [empathic concern and perspective taking subscales of the Interpersonal Reactivity Index (IRI); Davis, 1980]. Conceptually, scores on the empathic concern subscale have been related to emotional empathy and scores on the perspective taking subscale to cognitive empathy (Davis, 1980, 1983). Moreover, they completed questionnaires measuring individual differences in trait reciprocity (Perugini et al., 2003) as well as participants’ impressions of the other individuals (confederates; Hein et al., 2010; Hein, Engelmann, et al., 2016; modified from Batson et al., 1988). Questionnaire scores were comparable between studies, all ps > 0.29. Average scores and standard deviations are reported in Table 1.
Pain stimulation
In the fMRI study, painful stimulation was applied using a Digitimer DS7A constant current stimulator and an MRI compatible surface electrode attached to the left lower inner arm. Each shock consisted of a single 1 ms square-wave pulse. For pain stimulation in the laboratory, we used a mechano-tactile stimulus generated by a small plastic cylinder (612 mg). The projectile was shot against the cuticle of the left index finger using air pressure (Impact Stimulator, Labortechnik Franken, Release 1.0.0.34).
Importantly, the intensity of the painful stimulation in all studies was based on the same subjective criterion, determined in an individual pain thresholding procedure. Participants received pain stimulation with slowly increasing intensities starting with the lowest value of 0.00 mA (fMRI study) or 0.25 mg/s (replication and reciprocity control study) and increasing in steps of 0.05 mA (fMRI study) or 0.25 mg/s (replication and reciprocity control study) and rated its unpleasantness on a scale from 1 (no pain at all, but a participant could feel a slight tingling) to 10 (extreme, hardly bearable pain). In the main experiment, a subjective value of 8 (corresponding to a painful, but bearable pain) was used for painful stimulation and a subjective value of 1 was used for nonpainful stimulation.
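The logic of such an ascending thresholding series can be sketched in Python as follows. This is an illustration under explicit assumptions, not the procedure as implemented: the stopping rule (stop once a rating of 8 is reached), the safety cap on the number of steps, the choice to record the first intensity that produced each rating, and the simulated rater are all hypothetical.

```python
def ascending_threshold(rate, start, step, max_steps=200, pain_rating=8):
    """Increase intensity stepwise and record the first intensity per rating.

    rate: callable mapping an intensity to an unpleasantness rating (1-10).
    Returns a dict {rating: first intensity that produced this rating}.
    Assumption: the series stops once pain_rating is reached.
    """
    first_intensity_for = {}
    for k in range(max_steps):
        intensity = start + k * step
        rating = rate(intensity)
        first_intensity_for.setdefault(rating, intensity)
        if rating >= pain_rating:
            break
    return first_intensity_for

def simulated_rater(intensity_ma):
    """Hypothetical participant: rating rises by 1 every 0.10 mA."""
    return min(10, 1 + round(intensity_ma / 0.05) // 2)

# Values for the fMRI study as stated in the text: start 0.00 mA, step 0.05 mA.
levels = ascending_threshold(simulated_rater, start=0.0, step=0.05)
nonpainful_intensity = levels[1]  # subjective value 1
painful_intensity = levels[8]     # subjective value 8
```

For this simulated rater, the series stops at roughly 0.70 mA; real participants would of course produce individual values.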
Regression analyses
In all linear mixed effects regression models, we included participant as random intercept in order to account for shared error variance across multiple data points, that is, the within-subjects variables. Random slopes were included for continuous variables if these variables were also included as a fixed effect. As our categorical variables only yielded two levels, we did not include random slopes for categorical variables.
As a manipulation check, we first checked whether emotion ratings significantly differed for observed pain versus no-pain. To test this, we ran a linear mixed models analysis with the fixed effects of trial type (reinforced vs nonreinforced), study (fMRI vs replication study), and their interaction, participant as random intercept and the dependent variable emotion rating.
In order to test whether we successfully reinforced empathy, we conducted a linear mixed models analysis with empathy subscale (categorical dummy variable coding for “empathic concern” vs “perspective taking” subscale of the IRI; Davis, 1983), trait score (continuous: score on the respective subscale), trial type, study, and their interaction as fixed effects, participant and trial number as random intercept, and emotion ratings as dependent variable. In the behavioral reciprocity control study, the analogous analysis was conducted but using positive reciprocity as trait measure of reciprocity (positive reciprocity subscale of the PNR; Perugini et al., 2003).
In order to test the influence of condition, block, and trial number on social closeness, we conducted linear mixed models with condition (treatment vs control), block (Block 1 vs Block 2), trial number (1–25), and study (fMRI vs replication study) as fixed effects, participant as random intercept, trial number as random slope for participant, and trial-by-trial closeness ratings as dependent variable.
To motivate the assumption underlying the reinforcement learning models, we inspected the relationship between closeness updating, outcomes, and emotion ratings using linear regression models instead of RL models. This analysis aims to confirm an assumption that underlies our RL models, namely, that outcomes, that is, observing another’s pain or being saved from pain by another person, lead to changes in closeness ratings. To this end, we examined the influence of outcome on closeness updating within the two social contexts included in our study, namely, empathy and reciprocity. To closely follow the approach we use for our RL models, we used a two-level general linear model (GLM) with closeness updating [closeness rating(t) − closeness rating (t − 1)] as the dependent variable and three predictors, namely, outcome, block, and treatment (along with their interactions). To assess trial-by-trial changes in closeness updating, we first estimated this model for each subject individually and then entered the beta coefficients from each subject into a second-level ANOVA with Experiment Type (empathy/reciprocity) as between-subjects factor. Briefly, the results from these confirmatory analyses indicate that closeness ratings show a relationship with outcome for both reciprocity and empathy (Extended Data Figs. 2-1, 2-2), indicating that in the context of our studies they can be used as a proxy for expected value in RL models.
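The first-level step of this analysis can be illustrated with a simplified Python sketch. It computes closeness updating as defined above and regresses it on outcome for a single participant. This is a deliberate simplification: the model described in the text also includes block, treatment, and their interactions, and the resulting coefficients enter a second-level ANOVA that is not shown. The function names and the example data are hypothetical.

```python
def closeness_updates(ratings):
    """Closeness updating: rating(t) - rating(t - 1) for t = 2..T."""
    return [curr - prev for prev, curr in zip(ratings, ratings[1:])]

def ols_slope(x, y):
    """Slope of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical data for one participant (0-100 slider, 1 = reinforced trial).
ratings = [50, 54, 58, 57, 61, 60]
outcomes = [1, 1, 1, 0, 1, 0]

updates = closeness_updates(ratings)
# The update from t - 1 to t is paired with the outcome observed in trial t,
# because the closeness rating is given after the outcome within a trial.
beta_outcome = ols_slope(outcomes[1:], updates)
```

With a binary predictor, the slope equals the difference between the mean update after reinforced and after nonreinforced trials (here 4 − (−1) = 5 slider points).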
To test whether the neural sensitivity to empathy reinforcers that was related to individual recalibration (see below, fMRI statistical analysis for details) is also linked to social closeness in the treatment condition, we conducted two follow-up analyses. In the first model, we modeled social closeness ratings as a function of the neural betas extracted from IFG/aIns, block (acquisition vs extinction), empathy subscale (empathic concern vs perspective taking), and trait score. Our model also included participant as random intercept and trial number as random slope. The second model was conducted analogously but using the neural betas extracted from STS/TPJ.
Linear mixed model analyses were conducted in R (version 4.0.4; R Core Team, 2019) using the packages lme4 (Bates et al., 2014) and car (Fox et al., 2018). For mixed models, we report the chi-square values derived from Wald chi-square tests using type 3 sum of squares from the Anova() function (car package). For predefined contrasts, we report the t values derived from the summary() function. Simple slopes extracted from the linear mixed models are reported with 95% confidence intervals using the emtrends function (emmeans package; Lenth et al., 2019).
Specific Bayesian follow-up mixed models analyses to explicitly test for null effects were conducted using the brms package, and Bayes factors were determined using the function bayes_factor (Bürkner, 2021). Bayesian t tests were conducted using the function ttestBF from the package BayesFactor (Morey et al., 2022).
Computational modeling
To identify the computational mechanisms of the formation and maintenance of empathy-related social closeness, we tested three different learning models against each other (Fig. 3). Specifically, our baseline model, which implemented only the standard Rescorla–Wagner learning rule (Model 1; Fig. 3A), was compared with two recent adaptations (Models 2 and 3) that allowed us to test the role of specific processes, namely, differential learning rates for positive and negative feedback (Garrett and Daw, 2020) and context-dependent recalibration of the prediction error (Bavard et al., 2018; Palminteri and Lebreton, 2021). The first adaptation assumes different learning rates for positive prediction errors and negative prediction errors, that is, for the learning and the unlearning of an association (Model 2; Fig. 3B). If, for example, surprisingly positive feedback drives updating more strongly than surprisingly negative feedback, the learning rate for positive prediction errors will be larger than the learning rate for negative prediction errors. In the context of empathy-related and reciprocity-based social closeness, such a finding would entail that social closeness increases more rapidly in the acquisition block than it decreases in the extinction block.
In the second adaptation, we hypothesized that the assumed outcome values of the respective feedback (i.e., R = 1 for reinforcer feedback and R = 0 for nonreinforcer feedback) may vary depending on the respective context (e.g., empathy motive vs reciprocity motive; Model 3; Fig. 3C). That is, the outcome value is recalibrated depending on the context. This recomputed outcome value is then used to compute the prediction error which means that the learning signal itself is recalibrated (cf. Palminteri et al., 2015, p. 11 “[…] an outcome should be compared before updating option values.”). The larger this recalibration, the smaller the learning signal associated with a reinforced trial and the larger the learning signal associated with a nonreinforced trial and vice versa. Context-dependent recalibration therefore allows social closeness to continue to increase in the extinction block despite a high probability for nonreinforced trials.
Based on these models, we aimed to test whether empathy stability can be understood (1) in terms of asymmetrical updating of the learning signal (i.e., different learning rates for reinforced and nonreinforced trials) or (2) in terms of recalibration of the value associated with the feedback in each trial (i.e., a value different from 1 in reinforced trials and different from 0 in nonreinforced trials). Hence, we tested which out of three models in our model space best describes participants’ behavior. Furthermore, we evaluated an alternative modeling approach using a two-level model which we compare with the winning model (see Extended Data Fig. 3-6 for details and results).
In the simplest model (basic model), the estimated motive-driven closeness (SocialCloseness, corresponding in the present context to the expected value, which is traditionally denoted V) at trial t is updated with prediction error δ and free parameter α only (Fig. 3A):
$$\mathrm{SocialCloseness}_{t} = \mathrm{SocialCloseness}_{t-1} + \alpha \cdot \delta$$
Specifically, the prediction error is calculated as the difference between the actual outcome and the prediction:
$$\delta = R - \mathrm{SocialCloseness}_{t-1}$$
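As a minimal sketch of the basic model, the update rule described in the Fig. 3A caption can be iterated over a sequence of trials. The function name, the outcome sequence, and the parameter values below are hypothetical, and closeness is assumed to be scaled to the range of the outcome values (0 to 1).

```python
def rescorla_wagner(outcomes, alpha, start):
    """Return the closeness trajectory under the basic Rescorla-Wagner rule.

    outcomes: sequence of reinforcer values R (1 = reinforced, 0 = nonreinforced)
    alpha:    learning rate (free parameter)
    start:    initial closeness estimate (assumed to be on a 0-1 scale)
    """
    values = [start]
    for r in outcomes:
        delta = r - values[-1]                     # prediction error
        values.append(values[-1] + alpha * delta)  # value update
    return values


# Hypothetical example: two reinforced trials followed by one nonreinforced trial.
trajectory = rescorla_wagner([1, 1, 0], alpha=0.5, start=0.0)
```

In this rule, every nonreinforced trial pulls the estimate toward 0, which is why the basic model predicts a decay of closeness in an extinction block.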
Third, based on previous work (Palminteri et al., 2015; Palminteri and Lebreton, 2021) showing that subjective outcome values depended on whether learning took place in a context of primarily negative or primarily positive outcomes, we assumed outcome values of the respective feedback (i.e., R = 1 for reinforcer feedback and R = 0 for nonreinforcer feedback) may actually be recalibrated depending on the respective social context. That is, in the context of our studies, subjective outcome values associated with observing another’s pain or nonpain may differ from the theoretically assumed values of 1 (observed pain) and 0 (observed nonpain). Additionally, subjective outcome values may again be different for receiving help from another or not receiving help.
Consequently, for the present studies, the learning context was primarily defined by the respective motive which was reinforced (empathy motive vs reciprocity motive). Hence, ω may on average be different for individuals in the fMRI study and the behavioral replication study (both empathy motive) than for individuals in the behavioral reciprocity control study (reciprocity motive). To test whether the learning of motive-driven closeness can be understood in these terms, we added a third model (individual calibration model), in which the proposed outcome value is recalibrated by subtracting an additional free parameter ω (Eq. 4; Fig. 3C):
$$\delta = \begin{cases} (R - \omega) - \mathrm{SocialCloseness}_{t-1} & \text{if } R = 1 \text{ (reinforced trials)} \\ \omega - \mathrm{SocialCloseness}_{t-1} & \text{if } R = 0 \text{ (nonreinforced trials)} \end{cases}$$
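The recalibrated prediction error can be sketched as follows, using the form given in the Fig. 3C caption. The function names and the numeric values are hypothetical, and closeness is again assumed to lie between 0 and 1.

```python
def recalibrated_prediction_error(outcome, closeness, omega):
    """Prediction error with a recalibrated outcome value (cf. Fig. 3C).

    Reinforced trials (outcome 1) use the target 1 - omega;
    nonreinforced trials (outcome 0) use the target omega.
    """
    target = (1 - omega) if outcome == 1 else omega
    return target - closeness


def calibration_update(closeness, outcome, alpha, omega):
    """One update step of the individual calibration model (sketch)."""
    delta = recalibrated_prediction_error(outcome, closeness, omega)
    return closeness + alpha * delta


# Hypothetical values: with a large omega, a nonreinforced trial
# produces a positive prediction error and a reinforced trial a negative one.
pe_nonreinforced = recalibrated_prediction_error(0, closeness=0.5, omega=0.75)
pe_reinforced = recalibrated_prediction_error(1, closeness=0.5, omega=0.75)
```

Setting ω to 0 recovers the basic model, so the basic model is a special case of this sketch.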
To explore whether individual ω-values are linked to trait empathy, we conducted an additional linear model with ω-values from the fMRI and the behavioral replication study as dependent variable. We included empathy subscale (“perspective taking” vs “empathic concern”; Davis, 1983), trait scores and their interaction as predictors and study (fMRI vs behavioral replication) as control variable. Moreover, we conducted follow-up correlation analyses to test whether individual learning rates were linked to individual trait measures. The results showed a positive relationship between learning rate and empathic concern scores, indicating higher learning rates in participants that scored high on trait empathic concern. No effect was found for trait reciprocity and individual learning rates in the reciprocity control study (see Extended Data Table 1-1 for results).
Exploratory models
Based on the results obtained from the first model comparison and in order to more closely investigate the computational basis of empathy-related social closeness stability, we developed a second model space. That is, we tested whether the individual recalibration of the outcome value in these two groups depended on either the condition (treatment vs control), block (Block 1 vs Block 2), or both. The second model space hence comprised four different models either assuming only one general recalibration parameter ω, assuming condition-specific recalibration parameters ωtreatment and ωcontrol, assuming block-specific recalibration parameters ωblock 1 and ωblock 2, or assuming condition- and block-specific recalibration parameters ωtreat1, ωtreat2, ωcontrol1, and ωcontrol2.
We hypothesized that the maintenance of social closeness in the extinction block may be mechanistically subserved by a reversal of the respective feedback value for reinforced and nonreinforced trials (i.e., nonreinforced trials actually become the reinforced trials in Block 2 of the treatment condition). Such a reversal should entail a large recalibration of the feedback value in Block 2 of the treatment condition, since the larger the recalibration value ω, the more the feedback value of reinforced trials moves closer to 0 (i.e., the original value of nonreinforced trials), and the more the feedback value of nonreinforced trials moves closer to 1 (i.e., the original value of reinforced trials; Eq. 4). Hence, it is plausible to assume that the extent of recalibration may be large in Block 2 of the treatment condition but small in Block 1 of the treatment condition. No such differentiation should be observed for the control condition.
In order to test this hypothesis, we tested which out of four new models best described participants’ behavior (see Extended Data Fig. 3-1 for visualization of model space II). Model 1 corresponds to the winning model from model space I (individual calibration model); Model 2 is an extension of this model in that it assumes different values of recalibration for the treatment and the control condition but across both blocks (condition-specific recalibration model); Model 3 assumes different values of recalibration for Block 1 and Block 2 across both conditions (block-specific recalibration model); and Model 4 assumes different values of recalibration for Block 1 of the treatment condition, Block 2 of the treatment condition, Block 1 of the control condition, and Block 2 of the control condition (condition- and block-specific recalibration model). If the emotion reversal effect can indeed be understood in terms of outcome value reversal from the first to the second block of the treatment condition, the most complex model (condition- and block-specific recalibration model) should be most likely to have generated the data, revealing moderate recalibration in the two blocks of the control condition, low recalibration in the first block of the treatment condition, and high recalibration in the second block of the treatment condition.
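The four models of model space II differ only in how many ω parameters they allow and in which ω applies to a given trial. A minimal sketch of this indexing is shown below; the model labels, the key scheme, and the function name are hypothetical conveniences and do not come from the authors' implementation.

```python
# Number of recalibration parameters per model, as described in the text.
N_OMEGA = {"general": 1, "condition": 2, "block": 2, "condition_block": 4}


def omega_key(model, condition, block):
    """Return the key of the omega parameter that applies to a trial.

    model:     "general", "condition", "block", or "condition_block" (hypothetical labels)
    condition: "treatment" or "control"
    block:     1 or 2
    """
    if model == "general":
        return "all"
    if model == "condition":
        return condition
    if model == "block":
        return block
    if model == "condition_block":
        return (condition, block)
    raise ValueError(f"unknown model: {model}")


# Enumerate the distinct omega parameters each model needs.
cells = [(c, b) for c in ("treatment", "control") for b in (1, 2)]
keys_per_model = {m: {omega_key(m, c, b) for c, b in cells} for m in N_OMEGA}
```

Under this scheme, the hypothesis stated above corresponds to the parameter keyed by ("treatment", 2) being larger than the one keyed by ("treatment", 1) in the most complex model.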
Model optimization and comparison
The parameters
A lower nLPP value indicates that a model can explain the data better; however, the nLPP does not take a model’s complexity into consideration. To address this issue, we then applied the Laplace approximation to the model evidence (LAME) to penalize goodness-of-fit (i.e., the measure of nLPP for each subject) with model complexity (i.e., number of parameters). The LAME for each model was computed according to Equation 6:
To test which model out of the model space is most likely to have generated a certain dataset, we fed the LAME (from each subject in each model) into group-level random-effects analysis using the mbb-vb-toolbox (http://mbb-team.github.io/VBA-toolbox/; Daunizeau et al., 2014). This toolbox performs Bayesian model selection and estimates two indicators of model performance: the exceedance probability (EP) and the expected model frequencies (EF) for each model. Specifically, the EP of a model quantifies the probability for a given model to have generated the data relative to the other models in the model space. Commonly, an EP higher than 95% is an indicator of convincing evidence for a model to be most likely to have generated the data compared with other models. The expected frequency EF of a model quantifies the probability that the model generated the data for any randomly selected subject. Note that the EF should be higher than chance level given the number of models in the model space (in our case higher than 1/3).
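To make the two indicators concrete, the sketch below computes them from a Dirichlet distribution over model frequencies, which is the form the posterior takes in standard random-effects Bayesian model selection. This is an illustration under stated assumptions only: the Dirichlet parameters are hypothetical, the toolbox estimates them variationally from the model evidences (a step not reproduced here), and the Monte Carlo approach is a swapped-in simplification.

```python
import random


def expected_frequencies(dirichlet_alpha):
    """Expected model frequencies: the mean of the Dirichlet distribution."""
    total = sum(dirichlet_alpha)
    return [a / total for a in dirichlet_alpha]


def exceedance_probabilities(dirichlet_alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of the probability that each model is the most frequent.

    Independent Gamma(alpha_k, 1) draws, once normalized, follow a Dirichlet
    distribution; the index of the largest draw is unaffected by normalization.
    """
    rng = random.Random(seed)
    wins = [0] * len(dirichlet_alpha)
    for _ in range(n_samples):
        draws = [rng.gammavariate(a, 1.0) for a in dirichlet_alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


# Hypothetical Dirichlet parameters for a three-model space.
alpha = [1.5, 2.5, 40.0]
ef = expected_frequencies(alpha)
ep = exceedance_probabilities(alpha)
chance_level = 1 / len(alpha)  # EF of the winning model should exceed this
```

The comparison of the winning model's EF with the chance level mirrors the criterion stated above for a model space of three models.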
The modeling was conducted using MATLAB 2018b. The estimated rating (SocialCloseness) was initialized as the actual rating in the first trial in each block. All the parameters were optimized using MATLAB’s fmincon function with random starting points, ranging from 0 to 1.
fMRI data acquisition
Imaging data were collected using a 3 T MRI scanner (Skyra syngo, Siemens) with a 32-channel head coil. Functional imaging was performed with a multiband EPI sequence of 42 transversal slices oriented along the subjects’ anterior to posterior commissure (AC-PC) plane and distance factor of 50% (multiband acceleration factor of 2). The in-plane resolution was 2 × 2 mm² and the slice thickness was 2 mm. The field of view was 216 × 216 mm², corresponding to an acquisition matrix of 108 × 108. The repetition time was 1,340 ms, the echo time was 25 ms, and the flip angle was 60°. Structural imaging was conducted using a sagittal T1-weighted 3D MPRAGE with 240 slices and a spatial resolution of 1 × 1 × 1 mm³. The field of view was 256 × 256 mm², corresponding to an acquisition matrix of 256 × 256. The repetition time was 2,300 ms, the echo time was 2.96 ms, the total acquisition time was 3:50 min, and the flip angle was 9°. We obtained, on average, 1,215 EPI volumes (SE = 5.07) in the control condition and 1,208 EPI volumes (SE = 4.26) in the treatment condition for each participant. We used a rubber foam head restraint to avoid head movements.
fMRI preprocessing
Preprocessing and statistical parametric mapping were performed with SPM12 (Wellcome Department of Neuroscience) and MATLAB version 9.2 (MathWorks). Spatial preprocessing included realignment to the first scan and unwarping and coregistration to the T1 anatomical volume images. Unwarping of geometrically distorted EPIs was performed using the FieldMap Toolbox. T1-weighted images were segmented to localize the gray and white matter and cerebrospinal fluid. This segmentation was the basis for the creation of a DARTEL Template and spatial normalization to Montreal Neurological Institute space, including smoothing with a 6 mm (full-width at half-maximum) Gaussian kernel filter to improve the signal-to-noise ratio. To correct for low-frequency components, a high-pass filter with a cutoff of 128 s was used.
fMRI statistical analysis
First-level analyses
First-level analyses were performed with a GLM, using a canonical hemodynamic response function. Regressor lengths were defined from stimulus onset until the individual response was made by pressing a button (resulting in a time window of 1,000 ms + individual response time) for stimuli that required a response (emotion rating phase, closeness rating phase) and from stimulus onset to stimulus offset for stimuli that were just observed by participants (feedback phase, i.e., observing the partner’s pain vs nonpain). The main regressor of interest was the emotion rating phase (scale onset until button press), because this is the phase that is most clearly linked to the explicit empathic reaction (i.e., the cognitive reflection on the initial emotional reaction that is driven both by affective and by cognitive processes). Using a parametric modulator, the trial-by-trial activation during emotion ratings in the treatment condition and the control condition was regressed against trial type (value 1, observed pain; value 0, observed nonpain). The resulting parametric modulator tracks regions that show stronger activation when participants rate their emotions after observing the other’s pain compared with nonpain in the treatment and the control condition (PM trial type). We also defined the contrast between the parametric modulator of the treatment and the control condition. The closeness phase (scale onset until button press) and the feedback phase (stimulus onset until stimulus offset) were added as further regressors to account for variance during these task phases. For completeness, we defined the same parametric modulators and contrasts between parametric modulators (treatment vs control condition) also for the closeness and the feedback phases. An additional task of no interest was modeled as an additional regressor.
The residual effects of head motions were corrected by including the six estimated motion parameters for each participant and each session as regressors of no interest. To allow for modeling all the conditions in one GLM, an additional regressor of no interest was included, which modeled the potential effects of session.
Second-level analyses
In a first analysis, we investigated which regions responded more strongly to observed pain versus nonpain for each phase of interest. To do so, we first conducted one-sample t tests with the parametric modulators defined at the first level (i.e., tracking changes in activation according to trial type: value 1, observed pain; value 0, observed nonpain) for the emotion rating phase. Second, we conducted separate one-sample t tests with the first-level treatment versus control contrast images of the parametric modulators. Third, we investigated brain regions that capture the feedback recalibration when participants rate their emotional reaction after observing the other’s stimulation. To do so, we conducted second-level regression analyses with the contrast images of the parametric modulator (treatment vs control) of the emotion rating phase and the individual ω values as covariate. For completeness, we conducted and report the analogous analyses for the closeness rating phase and the feedback phase.
As recommended, a cluster-forming threshold of p < 0.001 uncorrected (Woo et al., 2014; Eklund et al., 2016; Yeung, 2018) was used, and where not stated otherwise, whole-brain level family-wise error (FWE) cluster-corrected statistics are reported at an α level of p < 0.05.
To test the relationship of neural activation related to individual recalibration with closeness ratings, emotion ratings, and trait empathy, we extracted beta values from the parametric modulator trial type during acquisition (treatment, Block 1) and extinction (treatment, Block 2) in the emotion rating phase from the resulting bilateral clusters in TPJ/STS and left IFG/aIns using MarsBar (Brett et al., 2002). Extracted beta values were added as predictors in two separate linear mixed models together with block (acquisition vs extinction) and empathy subscale (empathic concern vs perspective taking subscale of the IRI; Davis, 2006), trait score, and their interaction as fixed effects, participant and trial number as random intercepts, and social closeness as dependent variable.
Data and code availability
Data and code are available on GitHub (github.com/AnneSaulin/empathy_social_closeness). The design and the confirmatory analyses were preregistered on the Open Science Framework (https://osf.io/yz9rq/registrations).
Results
Results of the fMRI and the behavioral replication study
Manipulation checks
To confirm that participants differentiated between the two trial types (reinforced vs nonreinforced), we first tested whether participants’ emotional reaction to observed pain differed from their reaction to observed nonpain. This analysis of the emotion ratings showed a main effect of trial type (pain vs nonpain) across both studies (χ2 = 524.05; p < 0.001; β = 1.10; SE = 0.05), indicating that participants emotionally distinguished between those trials in which the partner received painful stimulation versus nonpainful stimulation in the fMRI study and the replication study. This effect was even stronger in the replication study than in the fMRI study (trial type × study interaction: χ2 = 32.03; p < 0.001; β = −0.34; SE = 0.06).
To test whether the emotion ratings were associated with external measures of affective and cognitive empathy, we conducted a linear regression analysis with the emotion ratings as dependent variable, empathy subscale (categorical dummy variable coding for “empathic concern” vs “perspective taking” subscale of the IRI; Davis, 2006), trait score (score on the respective subscale), and study (fMRI vs replication study) as predictors, and trial type (observed pain vs observed no-pain) as control variable. According to the results, emotion ratings after observing painful stimulation compared with nonpainful stimulation were significantly predicted by individual differences in trait scores on the two subscales (main effect of trait score: χ2 = 4.57, p = 0.03, β = −0.27, SE = 0.13; trait score × trial type interaction: χ2 = 12.35, p < 0.001, β = 0.53, SE = 0.15) with no significant difference between the two empathy subscales (trait score × empathy subscale × trial type interaction: χ2 = 0.05, p = 0.82, β = −0.04, SE = 0.19). This means that the higher an individual scored on trait perspective taking and trait empathic concern, the worse he or she felt when observing stimulation of the other person (main effect of trait score). This was particularly true for those trials in which participants observed painful stimulation of the other (trait score × trial type interaction). This effect was even more pronounced in the behavioral replication study (trait score × trial type × study interaction: χ2 = 6.91, p = 0.009, β = −0.49, SE = 0.19). These results suggest that the manipulation successfully reinforced empathy on a trial-by-trial basis.
Empathy-related social closeness resists extinction
The main goal of the current studies was to understand how social closeness based on empathy develops over time in the two blocks and conditions. To this end, a linear mixed model was conducted with trial number (1–25), block (Block 1 vs Block 2), and condition (control vs treatment) as fixed effects, participant as random intercept, and trial number as random slope for participant. This analysis revealed that empathy-related closeness increased with trial number in all blocks and conditions [main effect of trial number (p < 0.001); see Table 2 for full results and Fig. 2A for visualization]. This effect, however, was not modulated by block and condition (trial number × block × condition interaction: p = 0.103, log(BF) = −7.41). Average closeness was larger in Block 2 than in Block 1 (main effect of block: p < 0.001) and larger in the treatment than in the control condition (main effect of condition: p < 0.001). Further, results showed a significant interaction between condition and block (p < 0.001) which was, however, not qualified by differential effects of condition in the two blocks (i.e., all 95% confidence intervals of the simple means based on the model largely overlapped, indicating no significant post hoc effects). In contrast to a hypothesized decay in social closeness in Block 2 of the treatment condition, post hoc t tests comparing the mean of the last five trials in Block 1 and the mean of the last five trials in Block 2 revealed no significant difference in closeness (t(45) = −0.96; p = 0.344; log(BF) = −1.40), indicating sustained empathy toward another who is only rarely receiving painful stimulation.
The corresponding analysis in the behavioral replication study replicated these results (main effect of condition: p < 0.001, main effect of trial number: p < 0.001; main effect of block: p < 0.001; condition × block interaction: p < 0.001; trial number × block × condition interaction: p = 0.400, log(BF) = −3.47), with the exception of a larger main effect of block in the behavioral replication study and a more pronounced interaction between condition and block number (block × study: χ2 = 4.67, p = 0.031, β = −0.07, SE = 0.03; block × condition × study: χ2 = 7.66, p = 0.006, β = 0.13, SE = 0.05; Fig. 2A,C). Again, post hoc t tests comparing the means of the last five trials in Block 1 and the mean of the last five trials in Block 2 revealed no significant difference in social closeness (t(26) = 1.29; p = 0.208; log(BF) = −0.85).
Figure 2. Mean empathy-related social closeness and results of Bayesian model comparison in the fMRI study (top) and the behavioral replication study (bottom). A, Mean social closeness in the fMRI study with model-free trend line and pointwise 95% confidence interval (loess function) by block, condition, and trial number. Social closeness increased in Block 1 and plateaued or slightly increased in Block 2 in both conditions, demonstrating resistance to extinction of empathy-related social closeness. B, Bayesian model comparison of three models (see Fig. 3 for model space) revealed that individual recalibration of the learning signal associated with observing another’s pain versus no-pain was most likely to explain participants’ social closeness rating behavior. C, Replication of the behavioral pattern and (D) of the model comparison results in the behavioral replication study. See Extended Data Figures 2-1 and 2-2 for the link between trial-by-trial updates of social closeness and observed pain, Extended Data Figure 2-3 for absolute model fits, and Extended Data Figure 2-4 for results testing the association between individual ω-values and trait empathy.
Figure 2-1
Generalized linear models to predict trial-by-trial changes in (A) social closeness and (B) emotion ratings. Results of the General Linear Model with the predictors outcome, block, and condition (along with their interactions). (A) The analysis showed a significant effect of experiment type on the relationship between outcome and closeness updating (F1,82 = 56.25, p < .0001). This indicates a differential relationship between outcome and closeness updating, which we confirm in follow-up analyses: Outcome is positively associated with changes in closeness for reciprocity (β = 0.13, S.E. = 0.02, t = 6.08, p < .0001) and at a slightly reduced level with marginal significance for empathy (β = 0.0090, S.E. = 0.0052, t = 1.67, p = .0997). Note that before we ran these analyses, we removed outlier values (i.e., coefficient values lower than 2.5% and higher than 97.5%), which could reflect inaccuracies in first level model estimates. However, we observe similar results without outlier removal for the ANOVA (F1,98 = 24.88, p < .0001), and follow-up tests (reciprocity: β = 0.13, S.E. = 0.02, t = 5.47, p < .0001; empathy: β = 0.01, S.E. = 0.01, t = 1.21, p = .2281). (B) We find a significant effect of experiment type on the relationship between outcome and emotion updating (F1,77 = 45.28, p < .0001). This indicates a differential relationship between outcome and emotion updating, which we confirm in follow-up analyses: outcome is positively associated with emotion updating for reciprocity (β = 0.34, S.E. = 0.06, t = 5.47, p < .0001), and negatively for empathy (β = -0.15, S.E. = 0.03, t = -4.59, p < .0001). Note that we observe similar results in analyses without outlier removal for the ANOVA (F1,98 = 61.43, p < .0001), and follow-up tests (reciprocity: β = 0.34, S.E. = 0.05, t = 6.37, p < .0001; empathy: β = -0.19, S.E. = 0.03, t = -5.52, p < .0001). con = condition, blo = block, outcome 1/0 = observed outcome (pain = 1, non-pain = 0).
n.s.: not significant, ∼ : p < .1, *: p < .05, **:p < .01, ***:p < .001.
Figure 2-2
Bayesian model comparison results for follow-up models. (A) fMRI and behavioral replication study (empathy) (B) reciprocity control study. Model 1 reflects results from reinforcement learning models with closeness ratings as a proxy for expected value, model 2 reflects results from reinforcement learning models with emotion ratings as a proxy for expected value.
Figure 2-3
Visualization of absolute model fit (mean values and SEMs) for all studies and the three models from model space I. (A) fMRI study. The basic model (model 1, dark blue) starkly underestimates participants’ closeness behavior in the second block of the treatment condition, while the differential model (model 2, cyan) overestimates closeness behavior in the first block of the treatment condition. The winning model (individual calibration model, model 3, magenta) captures participants’ closeness ratings quite well, across both blocks of the treatment condition and the control condition. (B) This pattern is replicated in the independent behavioral replication study. (C) All three models capture participants’ behavior in the different blocks and conditions in the reciprocity control study comparably well.
Figure 2-4
Results of the linear regression testing the relationship between individual ω-values and trait empathy. Empathy subscale (“perspective-taking” vs. “empathic concern”), trait score, and their interaction were included as predictors, and study (fMRI vs. behavioural replication) as control variable (N = 73, 73 observations; max VIF = 5.78). VIF = variance inflation factor.
Effects on empathy-related social closeness
These results suggest that participants dynamically changed their social closeness in the acquisition block in which they observed pain stimulation of the other with high frequency and in the extinction block in which they observed pain stimulation of the other with low frequency (see Extended Data Fig. 2-1A,B for follow-up analysis linking trial-by-trial changes in social closeness and emotion ratings, respectively, to the trial-by-trial observation of pain vs nonpain). Next, we used computational modeling to clarify the underlying mechanisms.
Computational modeling of empathy-related social closeness
We tested which of three variants of the Rescorla–Wagner model best described the development of empathy-related social closeness (see Fig. 3 for visualization of the model space and Materials and Methods, Computational modeling for details). The first model (basic model; Fig. 3A) consisted of the basic Rescorla–Wagner model with one learning rate; the second model (differential model; Fig. 3B) allowed for a different learning rate in reinforced trials and nonreinforced trials; the third model included a recalibration parameter ω that directed the computation of the prediction error (individual calibration model; Fig. 3C).
Figure 3. Model space. A, Basic model. In the basic model, social closeness in the present trial SocialCloseness_t depends on the closeness rating in the previous trial SocialCloseness_{t−1} and the learning rate α multiplied by the prediction error δ. This prediction error is computed as the difference between the reinforcer value in the current trial R (1 vs 0) and the closeness rating of the previous trial SocialCloseness_{t−1}. B, Differential model. Same as the basic model except that α is different for reinforced (α+) and nonreinforced (α−) trials. C, Individual calibration model. Same as the basic model except that a recalibration parameter ω is added to the computation of the prediction error δ. That is, δ = (R − ω) − SocialCloseness_{t−1} if R = 1 (reinforced trials = R_reinf) and δ = ω − SocialCloseness_{t−1} if R = 0 (nonreinforced trials = R_non-reinf). See Extended Data Figures 3-1–3-5 for evaluations of an alternative model space that tested block- and/or condition-specific recalibration and Extended Data Figure 3-6 for evaluation of an alternative two-level modeling approach. SC, SocialCloseness.
Figure 3-1
Model space II. In order to more closely investigate the computational basis of empathy-related social closeness stability, we conducted Bayesian model comparison using a second model space. (A) The first model corresponds to the third model in model space I and allows for one general recalibration parameter. (B) The second model assumes different recalibration values for the treatment and the control condition. (C) The third model assumes different recalibration values for block 1 and block 2 in both conditions, and (D) the fourth model assumes different recalibration values for block 1 and 2 in the treatment condition and block 1 and 2 in the control condition. SC = SocialCloseness.
Figure 3-2
Results of the Bayesian model comparison of model space II. (A) In the fMRI study, the model assuming condition and block specific recalibration is more likely than the other models in the model space in terms of exceedance probability as well as estimated model frequencies. (B) In the reciprocity control study, this pattern is replicated regarding both metrics. Model 1 = individual calibration model, model 2 = condition-specific recalibration model, model 3 = block-specific recalibration model, model 4 = condition and block specific recalibration model.
Figure 3-3
Boxplots visualizing median and spread of the extracted ω parameters by block number and condition extracted from the winning model of model space II (condition and block wise recalibration model). (A) ω parameters resulting from modelling the behavior of the fMRI study. (B) ω parameters resulting from modelling the behavior of the replication study.
Figure 3-4
Visualization of absolute model fit (mean values and SEMs) for the fMRI study and the behavioural replication study comparing the fit of the individual calibration model (model 1, dark blue) and the individual block- and condition-specific recalibration model (model 4, magenta) from model space II. (A) In the fMRI study, across both blocks of the treatment condition, model 4 better describes participants’ behavior as compared to model 1, especially in the treatment condition. (B) This pattern is replicated for participants’ behavior in the behavioral replication study.
Figure 3-5
Results of the Bayesian model comparison of the four models in model space II for the fMRI study, behavioral replication study, and the reciprocity control study. Exceedance probabilities (EP) indicate the likelihood for a given model to have generated the data given the model space. Estimated model frequencies (EF) indicate the likelihood for a model to have generated the data of any randomly selected subject. The absolute value of the Laplace approximation to the model evidence (LAME) indicates how well a given model fits the empirical data taking model complexity into account. Lower values indicate better model fit. Where applicable mean values ± SEs are reported.
Figure 3-6
Results of the evaluation of a two-level model in comparison with the winning one-level model (individual calibration model). In the evaluation, using Bayesian model comparison we first identified an appropriate two-level model that links trial-by-trial expected value V with trial-by-trial social closeness (SocialCloseness = β0 + β1×V, with the intercept β0 and the regression weight β1), and compared this two-level model’s predictions of social closeness with the one-level model’s predictions, which highly correlated (r = 0.43 ± 0.04; t(45) = 10.95; p = 2.79*10−14). (A) Model recovery assessments based on 500 virtual participants using randomly sampled parameter values show that each model produced unique predictions and explains specific behavioral patterns. This is demonstrated by the model comparison results (AIC = Akaike Information Criterion and BIC = Bayesian Information Criterion), showing that both models perform well. Moreover, model recovery results indicate that each model produces a unique behavioral pattern that can be captured best by itself. (B) Visualization of the generative performance based on model simulations. Three parameters were estimated for the two-level model (basic Rescorla-Wagner model + regression model: β0, β1 and the learning rate δ) and two parameters for the one-level model (individual calibration model: ω and the learning rate δ). This allows for testing whether the estimated parameter values could reproduce learning performance observed in our data. Visually, the pattern of the one-level model (red lines) matches the pattern of the empirical data (black lines) slightly better than the two-level model (blue lines), particularly in block 1, as the two-level model predicts a very steep incline after block 1 and asymptotes very early, which is not observable in the data. (C) Bayesian model comparison results. 
The one-level model is the winning model with an exceedance probability of over 86% (probability that this model is more likely than all other models in the model space) and an estimated model frequency of 57% (probability that this model generated the data of any randomly selected participant).
Bayesian model comparison (see Materials and Methods for details) revealed that in the fMRI study (Fig. 2B), the individual calibration model is the winning model with an EP of over 99% (probability that this model is more likely than all other models in the model space) and an estimated model frequency of 97% (probability that this model generated the data of any randomly selected participant). This result was replicated in the behavioral replication study (Fig. 2D). Additional follow-up analyses demonstrated that using social closeness as a proxy for expected value was more likely to describe participants’ empathy-related behavior than the analogous model using emotion ratings as a proxy for expected value (see Extended Data Fig. 2-2 for visualization). In the present paradigm, social closeness thus represents the most valid proxy for expected value.
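To make the two comparison metrics concrete, the following minimal sketch shows one common way to obtain them, assuming that group-level model comparison yields a Dirichlet distribution over model frequencies. The Dirichlet parameters, the function names, and the number of samples are illustrative assumptions and are not taken from the present studies or from any specific toolbox.

```python
import random


def exceedance_probabilities(alpha, n_samples=20000, seed=0):
    """Monte Carlo estimate of exceedance probabilities.

    alpha: Dirichlet parameters over model frequencies (hypothetical values).
    Returns, for each model, the fraction of sampled frequency vectors in
    which that model has the largest frequency.
    """
    rng = random.Random(seed)
    wins = [0] * len(alpha)
    for _ in range(n_samples):
        # A Dirichlet draw is a vector of independent Gamma draws, normalized.
        # Normalization does not change which entry is largest, so it is skipped.
        draws = [rng.gammavariate(a, 1.0) for a in alpha]
        wins[draws.index(max(draws))] += 1
    return [w / n_samples for w in wins]


def expected_frequencies(alpha):
    """Expected model frequencies: the mean of the Dirichlet distribution."""
    total = sum(alpha)
    return [a / total for a in alpha]


# Illustrative Dirichlet parameters for a three-model space (not from the paper).
alpha = [2.0, 3.0, 44.0]
ep = exceedance_probabilities(alpha)
ef = expected_frequencies(alpha)
print("exceedance probabilities:", ep)
print("expected model frequencies:", ef)
```

In this sketch, a model can have an exceedance probability close to 1 while its estimated frequency is lower, because the two metrics answer different questions: which model is most frequent in the population versus how frequent a model is.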
The recalibration parameter ω
For empathy-related social closeness, the respective winning model included a recalibration parameter ω. The larger this parameter, the more likely are nonreinforced trials to elicit a positive prediction error and hence a positive updating of closeness. A large ω should thus entail less decay of social closeness in the extinction block than a small ω.
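The role of ω can be illustrated with a minimal simulation. The sketch below is an assumption-laden simplification and not the fitted model: it assumes a Rescorla-Wagner-type update in which reinforced (pain) trials carry an outcome value of 1 and nonreinforced (no-pain) trials carry an outcome value of ω instead of 0. The function name, the learning rate, the trial sequences, and the reinforcement rates are hypothetical.

```python
def simulate_closeness(outcomes, learning_rate, omega, v0=0.0):
    """Simulate expected value (closeness) under a recalibrated outcome.

    outcomes: sequence of booleans, True if pain was observed on that trial.
    omega: assumed outcome value assigned to nonreinforced (no-pain) trials.
    """
    v = v0
    trajectory = []
    for pain_observed in outcomes:
        # Assumption: pain trials have outcome 1, no-pain trials have outcome omega.
        outcome_value = 1.0 if pain_observed else omega
        prediction_error = outcome_value - v
        v += learning_rate * prediction_error
        trajectory.append(v)
    return trajectory


# Hypothetical schedules: 75% pain in acquisition, 25% pain in extinction.
acquisition = [True, True, True, False] * 5
extinction = [False, False, False, True] * 5
outcomes = acquisition + extinction

basic = simulate_closeness(outcomes, learning_rate=0.2, omega=0.0)
recalibrated = simulate_closeness(outcomes, learning_rate=0.2, omega=0.8)

end_acq = len(acquisition) - 1
print("decay without recalibration:", basic[end_acq] - basic[-1])
print("decay with recalibration:", recalibrated[end_acq] - recalibrated[-1])
```

Under these assumptions, the simulated value drops markedly during extinction when ω = 0 and drops much less when ω is large, which mirrors the qualitative prediction stated above.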
The recalibration parameter ω was initially estimated across all blocks and conditions as one variable characterizing each individual. Exploratory analyses that tested the relationship between the individual ω values and individual scores of the empathic concern and the perspective taking subscales (IRI; Davis, 1983) revealed no significant results (all ps > 0.32; Extended Data Fig. 2-4).
To more specifically test whether strong recalibration was specific to the extinction block, we assessed four additional models in which ω was free to vary by block, by condition, or both. Bayesian model comparison showed that the model allowing for block-specific as well as condition-specific estimations of ω best described participants’ behavior (see Extended Data Fig. 3-2 for visualization of the Bayesian model comparison results, Extended Data Fig. 3-5 for comparison metrics, and Extended Data Fig. 3-4 for visualization of absolute model fit). Analysis of the individual ω showed that on average, participants recalibrated more strongly in the extinction block than in the acquisition block, that is, on average ω was larger in the extinction block than in the acquisition block (fMRI study: T(45) = 2.753, p = 0.009, CI = [0.054, 0.345]; replication study: T(26) = 2.0, p = 0.056, CI = [−0.005, 0.384]), but recalibration values did not significantly differ between Block 1 and Block 2 for the control condition (fMRI study: T(45) = −0.579, p = 0.568, CI = [−0.139, 0.077]; replication study: T(26) = −1.027, p = 0.314, CI = [−0.176, 0.059]; for visualization of the median and spread of the extracted parameters, see Extended Data Fig. 3-3). These results indicate that social closeness resisted extinction, because in the extinction block participants updated social closeness based on the observation of no-pain trials, that is, trials that were defined as nonreinforcers and now elicited positive prediction errors (captured by ω). In contrast, in the acquisition block, participants’ changes in empathy-related closeness were driven by observing pain in the other, that is, the event that was originally defined as the reinforcer.
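The reported degrees of freedom (N − 1) are consistent with a within-participant comparison of ω between blocks. As a minimal sketch of such a comparison, assuming a paired t-test, the example below computes the t statistic and a 95% confidence interval of the mean difference. The ω values are made up for illustration, and the critical value is hard-coded from a t table for this toy sample size.

```python
import math
import statistics


def paired_t(first, second):
    """Paired t statistic for second minus first.

    Returns (t, degrees of freedom, mean difference, standard error).
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1, mean_diff, se


# Hypothetical per-participant omega estimates (not data from the studies).
omega_acquisition = [0.10, 0.25, 0.05, 0.30, 0.15, 0.20]
omega_extinction = [0.35, 0.30, 0.20, 0.55, 0.10, 0.45]

t_value, df, mean_diff, se = paired_t(omega_acquisition, omega_extinction)

# Two-sided 95% critical value of the t distribution for df = 5 (from a t table).
t_crit = 2.571
ci_low = mean_diff - t_crit * se
ci_high = mean_diff + t_crit * se
print(f"T({df}) = {t_value:.3f}, CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

By convention, the interval is reported with its lower bound first, and it excludes zero exactly when the two-sided test is significant at the 5% level.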
Reciprocity control study
So far, our results revealed that the sustained nature of empathy-related closeness can be understood in terms of the recalibration of the outcome value associated with observing another’s pain versus nonpain. To test if this recalibration of feedback used to update social closeness is a general phenomenon or specifically related to empathy, we conducted a behavioral reciprocity control study using the identical experimental design to test the formation and stability of reciprocity-based closeness. Reciprocity, commonly defined as returning a previously given or an anticipated favor (Gouldner, 1960; McCabe et al., 2003; Hein et al., 2016a), is one of the most important social norms worldwide (Axelrod and Hamilton, 1981; Perugini et al., 2003; Falk and Fischbacher, 2006; Nowak, 2006). Similar to empathy, reciprocity can increase closeness (Neyer et al., 2011; Adams and Miller, 2022) and is a strong motivator of prosocial behavior (Fehr et al., 2002). However, whereas empathy-related closeness and prosociality are elicited by sharing the emotions of the other, reciprocity-based processes are conditional on the other’s behavior, that is, reflect a “tit-for-tat” principle rather than shared emotions (Dufwenberg and Kirchsteiger, 2004; Rand et al., 2009; Zaki, 2014; Eccles et al., 2020). Hence, to reinforce reciprocity in the present paradigm, the participant received help from the other person, that is, the other person gave up a monetary reward to save the participant from pain, a procedure that has been established for enforcing direct positive reciprocity toward the helper (Hein et al., 2010; Saulin et al., 2022). Crucially, the trial structure and the assessment of social closeness were identical to those in the two empathy studies outlined above (Fig. 1C).
Manipulation check
Analogously to the empathy studies above, we first analyzed participants’ emotion ratings to test our manipulation. Results of a linear mixed model revealed a main effect of trial type (χ2 = 62.89; p < 0.001; β = −1.06; SE = 0.13) for the reciprocity motive. Thus, participants emotionally distinguished between those trials in which the partner had decided to help them versus decided not to help them. Moreover, there was a significant relationship between participants’ emotion ratings and their scores for positive reciprocity on the trait reciprocity scale (Perugini et al., 2003; χ2 = 4.34; p = 0.037; β = 0.26; SE = 0.12), confirming that our paradigm successfully reinforced positive reciprocity.
Reciprocity-related social closeness can be extinguished
To analyze the development of reciprocity-related social closeness over time, we fitted a linear mixed model with trial number, block, and condition as fixed effects, participant as random intercept, and trial number as random slope for participant. This analysis revealed a significant three-way interaction of condition, trial number, and block (p < 0.001), which shows that the development of social closeness over time differentially depended on the block as well as the condition (see Fig. 4A for visualization and Table 3 for full results). As such, reciprocity-related social closeness was affected significantly by reinforcement frequency: in the treatment condition (Fig. 4A, dark lines), social closeness increased when strongly reinforced during the acquisition block (simple slope: β = 0.02; 95% interval = [0.01, 0.03]) and decayed when weakly reinforced during the extinction block (β = −0.05; 95% interval = [−0.06, −0.04]), while in the control condition (Fig. 4A, light lines) where reinforcement remained at chance level in Block 1 and Block 2, little change in social closeness ratings was observed (Block 1: β = −0.01, 95% interval = [−0.02, −0.004]; Block 2: β = −0.007, 95% interval = [−0.02, 0.001]; see Extended Data Fig. 2-1 for follow-up analysis linking trial-by-trial changes in social closeness to the trial-by-trial observation of pain vs nonpain).
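For orientation, the structure of such a model can be written out as follows. This is a sketch under stated assumptions rather than the exact specification used in the analysis: it assumes a full-factorial fixed-effects structure, Gaussian residuals, and the hypothetical symbols T (trial number), B (block), and C (condition) for rating j of participant i.

```latex
\begin{aligned}
\text{closeness}_{ij} ={}& \beta_0 + \beta_1 T_{ij} + \beta_2 B_{ij} + \beta_3 C_{ij} \\
&+ \beta_4 T_{ij} B_{ij} + \beta_5 T_{ij} C_{ij} + \beta_6 B_{ij} C_{ij}
 + \beta_7 T_{ij} B_{ij} C_{ij} \\
&+ u_{0i} + u_{1i} T_{ij} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here u_{0i} is the random intercept and u_{1i} the random slope of trial number for participant i. In this notation, the three-way interaction corresponds to the coefficient β_7, and the simple slopes describe the effect of trial number within each combination of block and condition.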
Behavioral pattern and Bayesian model comparison results of the reciprocity control study. A, Mean social closeness with model free trend line and pointwise 95% confidence interval (loess function) by block, condition, and trial number. Social closeness increased in Block 1 of the treatment condition (acquisition) and starkly decreased in Block 2 (extinction), demonstrating no resistance to extinction of reciprocity-related social closeness. B, Bayesian model comparison of three models (see Fig. 3 for model space) revealed that the basic model assuming simple updating directly based on the learning signal and individual recalibration of the learning signal associated with observing another’s help versus no help are equally likely to explain participants’ reciprocity-related social closeness rating behavior.
Effects on reciprocity-related social closeness
Computational modeling of reciprocity-related social closeness
Bayesian model comparison conducted analogously to the fMRI and the behavioral replication study revealed that in the reciprocity control study, the basic model was about as likely to have generated the data as the individual calibration model (Fig. 4B; see Table 4 for overview of model comparison metrics and Extended Data Fig. 2-3C for visualization of absolute model fit). Hence, in contrast to empathy-related social closeness formation and stability, the temporal evolution of reciprocity-related social closeness can also be well captured by a simple learning rule. This is in line with the decrease in social closeness when the frequency of helping declined in the extinction block, that is, when the interaction partner helped only rarely.
Results of the Bayesian model comparison of the three models in model space I for the three studies
Imaging results
The behavioral results revealed that empathy-related social closeness, in contrast to reciprocity-related social closeness, is robust against extinction, as individuals recalibrate the outcome value associated with observing the other person receive painful versus nonpainful stimulation. Moreover, results from computational modeling indicate that no-pain trials (nonreinforced trials) are associated with positive outcome values and are thus likely to lead to positive prediction errors, enabling an increase in empathy-related social closeness based on nonreinforced trials.
In a next step, we investigated the neural mechanisms underlying the observed stability of empathy-related closeness. As a manipulation check, we first analyzed the neural activation after observing painful or nonpainful stimulation in the treatment and the control condition as an indicator of neural sensitivity to reinforced (painful) as compared with nonreinforced (nonpainful) trials. Focusing on the emotion rating phase, a regression analysis with the parametric modulator trial type (painful vs nonpainful) revealed an increased activation for the processing of observed painful stimulation in the right IFG/aIns [peak coordinates: x = 38, y = 28, z = −4, p(whole-brain FWE-cluster-corrected) = 0.033, k = 143] and the bilateral TPJ (x = −52, y = −52, z = 20, T(44) = 6.21, p < 0.001, k = 898; x = 62, y = −48, z = 22, T(44) = 4.74, p < 0.001, k = 532), that is, regions that have been associated with empathy (Fig. 5A; Extended Data Fig. 5-1), as well as the right occipital pole (peak coordinates: x = 16, y = −92, z = 8, p = 0.005, k = 214).
Neural responses to observed pain versus nonpain in the emotion rating and the closeness rating phases. A, Emotion rating phase: significant results in the right IFG/aIns and bilateral TPJ. B, Closeness rating phase: significant results in the dorsomedial prefrontal cortex, ventral striatum, dorsal striatum, and IFG. For visualization purposes, maps were thresholded at p < 0.001 uncorrected with cluster size k ≥ 50. IFG, inferior frontal gyrus; aIns, anterior insula; TPJ, temporoparietal junction; dmPFC, dorsomedial prefrontal cortex; VS, ventral striatum; DS, dorsal striatum. See Extended Data Figure 5-1 for all results at p < 0.001 uncorrected and k > 50 voxels.
Figure 5-1
Results of the second-level analysis with the contrast parametric modulator trial type during the emotion rating phase and the closeness rating phase. This analysis shows which neural regions are more active when participants observed others in pain compared to observing non-pain in all blocks and conditions. P < .001 uncorrected, k > 50. Reported coordinates are in MNI space.
Conducting the same analysis with the data from the closeness rating phase revealed activations in right dorsomedial prefrontal cortex [peak coordinates: x = 18, y = 54, z = 32, T = 4.66, p(whole-brain FWE-cluster-corrected) < 0.001, k = 490], right ventral and dorsal striatum [peak coordinates: x = 8, y = 12, z = 4, T = 4.15, p(whole-brain FWE-cluster-corrected) = 0.009, k = 190; peak coordinates: x = 30, y = 2, z = −8, T = 5.20, p(whole-brain FWE-cluster-corrected) = 0.032, k = 143], and IFG [peak coordinates: x = 42, y = 28, z = −12, T = 4.49, p(whole-brain FWE-cluster-corrected) < 0.001, k = 408; Fig. 5B; Extended Data Fig. 5-1]. The respective analyses with the feedback phase revealed activation in the right aIns, but only at an uncorrected threshold (p < 0.001).
Contrasting the parametric modulators of the treatment and the control condition revealed no significant results for the emotion rating phase, which is expected given that on average participants observed the same number of pain trials in both conditions. For the closeness rating phase, this analysis showed a network of brain regions with the strongest effect in the lateral prefrontal cortex and the precentral sulcus (see Extended Data Fig. 6-4 for full results at p < 0.001 uncorrected and k > 50).
Based on the modeling results reported above, the neural effects in the treatment condition in contrast to the control condition should be modulated by the recalibration parameter, that is, the parameter that prevented a decline of empathy-related closeness in the extinction block. This effect should be strongest in the emotion rating phase, because this is the phase that is most clearly linked to the explicit empathic reaction.
To test this, using a second-level regression, we regressed the individual ω parameter against the treatment versus control contrast of the parametric modulator for observed pain in the emotion rating phase (see analysis above). This analysis reveals regions in which increasing feedback recalibration is related to increasing neural responses to pain trials compared with nonpain trials in the treatment versus the control condition. The results showed significant activations in the left IFG extending into aIns (peak coordinates: x = −32, y = 16, z = 18, t(44) = 4.73, p = 0.001, k = 269; Fig. 6A, top panel) and the bilateral STS/TPJ (left hemisphere peak coordinates: x = −66, y = −26, z = 0, t(44) = 5.62, p < 0.001, k = 517; right hemisphere peak coordinates: x = 60, y = −16, z = 10, t(44) = 6.56, p < 0.001, k = 471; Fig. 6A, bottom panel; see Extended Data Fig. 6-1 for full results).
The neural responses to observing painful and nonpainful stimulation in others are modulated by the recalibration of the feedback signal (ω) and predict individual changes in social closeness. A, Regressing the recalibration parameter (ω) against the neural differences in emotion rating-related responses between the treatment and the control condition revealed significant results in the left IFG and adjacent anterior insula (IFG/aIns; top panel) and the bilateral STS/TPJ (bottom panel). B, Relationship between participants’ trial-wise social closeness ratings (averaged over acquisition and extinction as block had no differential effect) and neural sensitivity to others’ pain [i.e., the extracted beta values from the parametric modulator trial type for acquisition (treatment Block 1) and extinction (treatment Block 2), respectively] in IFG/aIns (top panel) and STS/TPJ (bottom panel). For visualization purposes, maps were thresholded at p < 0.001 uncorrected with cluster size k ≥ 50. STS, superior temporal sulcus; TPJ, temporoparietal junction; IFG, inferior frontal gyrus; aIns, anterior insula. See Extended Data Figures 6-1 and 6-4–6-6 for all results at p < 0.001 uncorrected and k > 50 voxels for all task phases and see Extended Data Figures 6-2 and 6-3 for regression results as depicted in B.
Figure 6-1
Results of the second-level regression with the contrast parametric modulator trial type treatment > parametric modulator trial type control during the emotion rating phase and individual recalibration parameters ω as covariate. This analysis shows for which neural regions increased neural sensitivity to another’s pain in the treatment condition compared to the control condition is linked to individual recalibration. p < .001 uncorrected, k > 50. Reported coordinates are in MNI space.
Figure 6-2
Results of the linear mixed model analyses with the beta estimates from IFG/aIns during the emotion rating phase (reflecting the neural sensitivity to another’s pain, see Fig. 6-1), block (acquisition vs. extinction), empathy subscale (empathic concern vs. perspective-taking), trait score, and trial type (observed pain vs. observed non-pain) as fixed effects, participant as random intercept, and trial-by-trial closeness ratings as dependent variable (N = 46, 4600 observations). χ² and P(χ²) are the type 3 Wald χ² test statistics. aIns = anterior insula; IFG = inferior frontal gyrus.
Figure 6-3
Results of the linear mixed model analyses with the beta estimates from STS/TPJ during the emotion rating phase (reflecting the neural sensitivity to another’s pain, see Fig. 6-1), block (acquisition vs. extinction), empathy subscale (empathic concern vs. perspective-taking), trait score, and trial type (observed pain vs. observed non-pain) as fixed effects, participant as random intercept, and trial-by-trial closeness ratings as dependent variable (N = 46, 4600 observations). χ² and P(χ²) are the type 3 Wald χ² test statistics. STS = superior temporal sulcus; TPJ = temporo-parietal junction.
Figure 6-4
Results reflecting the differential effect of observing others in pain (vs non-pain) in the treatment compared to the control condition during the closeness rating phase. This analysis tests the interaction between observing others in pain (modelled via the trial type parametric modulator) and treatment condition, identifying regions that respond stronger to observing the other’s pain in the treatment condition compared to the control condition. p < .001 uncorrected, k > 50. Reported coordinates are in MNI space. dlPFC = dorsolateral prefrontal cortex, dmPFC = dorsomedial prefrontal cortex.
Figure 6-5
Results of separate second-level regressions in the feedback and the closeness phases testing the relationship between the treatment vs control contrast of the parametric modulator for pain and the individual recalibration parameter ω. This analysis shows regions that show significant modulation by the individual recalibration ω for the interaction contrast reflecting increased neural sensitivity to another’s pain in the treatment condition compared to the control condition. p < .001 uncorrected, k > 50. Reported coordinates are in MNI space.
Figure 6-6
Results of the linear mixed model analyses with the beta estimates from angular gyrus, IFG, middle temporal gyrus, and precuneus during the closeness rating phase (reflecting the neural sensitivity to another’s pain, see Fig. 6-4), block (acquisition vs. extinction), empathy subscale (empathic concern vs. perspective-taking), trait score, and trial type (observed pain vs. observed non-pain) as fixed effects, participant as random intercept, and trial-by-trial closeness ratings as dependent variable (N = 46, 4600 observations). χ² and P(χ²) are the type 3 Wald χ² test statistics. ANG = angular gyrus, IFG = inferior frontal gyrus, MTG = middle temporal gyrus, PCUN = precuneus.
Respective analyses for the closeness rating phase showed significant results in the precuneus [peak: x = −4, y = −56, z = 56, T = 4.62, p(whole-brain FWE-cluster-corrected) < 0.001, k = 598 voxels], supramarginal gyrus/angular gyrus [peak: x = −52, y = −52, z = 44, T = 4.96, p(whole-brain FWE-cluster-corrected) = 0.044, k = 140 voxels], and IFG [peak: x = −38, y = 40, z = 0, T = 4.82, p(whole-brain FWE-cluster-corrected) = 0.048; see Extended Data Fig. 6-5 for full results at p < 0.001 uncorrected and k > 50]. In the feedback phase, we observed significant results in the supramarginal gyrus [peak coordinates: x = −56, y = −28, z = 32, T = 4.63, p(whole-brain FWE-cluster-corrected) = 0.003, k = 236; see Extended Data Fig. 6-5 for full results at p < 0.001 uncorrected and k > 50].
In a final step, we tested if neural regions associated with the recalibration of the feedback signal (ω parameter) observed above were related to changes in social closeness. To do so, we extracted the beta estimates from the entire clusters of activation observed in the second-level regression separately for the acquisition and the extinction blocks and tested their relationship with social closeness ratings. Focusing on the emotion rating phase, we first entered the IFG/aIns and STS/TPJ beta estimates in separate models, together with trial type (observed pain vs observed nonpain) and block (acquisition vs extinction) as predictors and trial-by-trial social closeness ratings in the respective blocks as dependent variable. Given that trait empathy influenced the emotional reactions on the behavioral level, the empathic concern and perspective taking subscales (IRI; Davis, 1983) were added as continuous (trait scores) and categorical (subscale empathic concern vs perspective taking) control variables. Results revealed significant interactions between IFG/aIns beta estimates × trial type (χ2 = 5.64, p = 0.018, β = −0.18, SE = 0.07; Fig. 6B, top panel) and STS/TPJ beta estimates × trial type (χ2 = 6.43, p = 0.011, β = −0.08, SE = 0.03; Fig. 6B, bottom panel), reflecting a stronger effect of neural recalibration on social closeness ratings when observing nonpain trials compared with pain trials (see Extended Data Figs. 6-2 and 6-3 for full results). No other effects reached significance (all ps > 0.246).
Analogous analyses with the beta estimates that were extracted from the significant activations during the closeness rating phase and the feedback phase revealed no significant interaction effects with trial type (see Extended Data Fig. 6-6 for full results).
Discussion
Here we present the results of two independent studies, showing how empathy-related social closeness is formed and preserved. Using computational modeling, we reveal that empathy-related social closeness is learned if participants repeatedly and frequently observe another person receiving pain. Importantly, the learned empathy-related social closeness persisted even when the other person was no longer facing frequent pain. This means that social closeness that was generated “in bad times,” that is, by empathy with the misfortune of another person, is transferred to “good times” in which the other person feels well again.
The computational modeling approach in which we tested different extensions of a standard reinforcement learning model provided insights into the learning mechanisms that allowed for the transition of empathy-related social closeness from “bad times” to “good times.” First, our modeling results revealed that the maintenance of empathy-related social closeness contradicts the assumptions of basic reinforcement learning models. According to these models, empathy-related social closeness should decay if empathy is no longer reinforced. In contrast to this assumption, our data showed that social closeness ratings remained high, even when the participants hardly observed painful stimulation of the other, that is, the event that had induced empathy-related closeness in the first place. Instead, we found that after they learned empathy-related closeness based on observing pain, participants maintained this social closeness by now learning from positive events (lack of pain) for the other as well. At the computational level, this change in feedback used for learning was captured by a recalibration parameter (ω) which influences the likelihood that formerly nonreinforced trials (here nonpain trials) can elicit a positive prediction error and thus learning. The recalibration of the learning feedback signal linked to the extinction resistance of empathy-related social closeness in our study is in keeping with previous studies that showed that the feedback value is susceptible to different learning contexts and can be individually adjusted (Bavard et al., 2018; Pischedda et al., 2020; Hunter and Daw, 2021). The type of context is not decisive as it can take on different forms, such as outcome valence and magnitude (Bavard et al., 2018), uncertainty of reward in a given environment (Hunter and Daw, 2021), or the richness of feedback provided (Pischedda et al., 2020). 
Extending this previous work, our findings showed that social closeness can be learned from two opposing social feedback signals, that is, the feedback that another person is in danger (pain) or the feedback that another person is safe (no longer suffering pain). Given evidence that observing others’ pain elicits empathy for pain (Singer et al., 2004; Lamm et al., 2009; Singer and Klimecki, 2014; Hein et al., 2016a) and observing others receiving rewards elicits empathic joy (Batson et al., 1991; Andreychik, 2019), our findings suggest that the formation of empathy-related closeness (captured by the processes in the acquisition block) is related to empathy for pain, while the maintenance of empathy-related closeness (captured by the processes in the extinction block) is related to empathic joy.
On the neural level, the maintenance of empathy-related closeness was related to activation in bilateral STS/TPJ and left IFG/aIns, regions that have been associated with cognitive and affective empathy, respectively. The larger an individual’s estimated recalibration parameter was, the more sensitive was the neural activation in these regions in response to another’s pain versus nonpain across all blocks and conditions. Follow-up analyses showed that in the treatment condition, the stronger the neural activation in response to another’s nonpain versus pain, the closer participants felt to the other in trials of observed nonpain as compared with trials of observed pain. This suggests that differential neural sensitivity to observed pain and observed nonpain is linked to the stability of empathy-related social closeness.
Underlining the robustness of our results, the finding of sustained empathy-related social closeness and the underlying computational mechanism that we obtained in the fMRI study were replicated in an independent study in the laboratory. Moreover, the specificity of our findings is highlighted by results of a reciprocity control study. According to the results of the reciprocity control study, social closeness can also be induced by the social norm of reciprocity. Importantly, however, reciprocity-related social closeness, as opposed to empathy-related social closeness, decays rather quickly despite using the same gradual extinction procedure as for empathy. Specifically, participants showed a learning-related decrease in social closeness if the other person stopped behaving in a reciprocity-evoking manner (reflected by the decrease in closeness ratings in the extinction block). The learning-related changes in reciprocity-based social closeness were well captured by a basic reinforcement learning model without recalibration for a large portion of the participants. In contrast, the model of empathy-related closeness required a recalibration parameter to capture the consistently high closeness ratings in the extinction block.
Inspired by previous work, empathy was induced by observing others in pain (Xu et al., 2009; Beeney et al., 2011; Marsh, 2018), that is, an outcome that is relevant for others, and reciprocity was induced by being saved from pain (Gouldner, 1960; McCabe et al., 2003; Hein et al., 2016b; Saulin et al., 2022), that is, an outcome that is immediately relevant for the participants. Both manipulations induced a comparable extent of social closeness, but, according to our modeling results, the underlying mechanisms were different. The differences in mechanisms are in line with previous works, suggesting that other-relevant outcomes (here related to the induction of empathy) and self-relevant outcomes (here related to the induction of reciprocity) can be computationally distinguished (Lockwood et al., 2016; Cutler et al., 2020; Fornari et al., 2020; Golubickis and Macrae, 2022). In more detail, there is previous evidence showing that self-relevant information is computationally favored (Lockwood et al., 2016; but see Cutler et al., 2020; Golubickis and Macrae, 2022 for other evidence). Extending previous works, our results show that people learn differently from self-relevant compared with other-relevant information also in the social domain. A self-related learning bias may hence contribute to the efficient unlearning of social closeness in the reciprocity control study.
To exclude cross-gender effects, which are likely to occur if female participants interact with male confederates and vice versa, we only tested females. Previous studies have observed that empathic responses on the behavioral and the neural level may differ between men and women (Christov-Moore et al., 2014; Bluhm, 2017). Thus, the present findings may not directly translate to male participants. Future studies are required to show if our results generalize to male participants.
Across the three studies, all experimental parameters were kept constant (e.g., the reinforcer rates in each block) to optimize for comparability. However, for reciprocity, different parameters may be optimal with respect to the formation and stability of social closeness. Future studies should therefore test the longevity of reciprocity-related social closeness using a paradigm optimized for reciprocity.
In conclusion and bearing in mind these limitations, the presented results show that empathy-related social closeness, generated in “bad times,” transfers to “good times.” It has been proposed that empathy is the glue that holds relationships and societies together (Witenberg and Thomae, 2016; Calloway-Thomas et al., 2017). The present study provides evidence for the longevity of empathy-related social closeness and reveals the underlying computational and neural mechanisms that may explain why empathy can lead to stable personal and societal relationships.
Footnotes
We thank the students who acted as confederates and assisted in data collection. This work was supported by the German Research Foundation (HE 4566/5-1; HE 4566/3-2) to G.H. and by a PhD fellowship by the German Academic Scholarship Foundation awarded to A.S. J.B.E. gratefully acknowledges support from Amsterdam Brain and Cognition. C.-C.T. was supported by GSSA, MOE Taiwan Scholarship (1081007012).
The authors declare no competing financial interests.
Correspondence should be addressed to Anne Saulin at saulin_a{at}ukw.de.