Abstract
An organism's behavior is sensitive to different reinforcements in the environment. Based on extensive animal literature, the reinforcement sensitivity theory (RST) proposes three separate neurobehavioral systems to account for such context-sensitive behavior, affecting the tendency to react to punishment, reward, or goal-conflict stimuli. The translation of animal findings to complex human behavior, however, is far from obvious. To examine whether the neural networks underlying humans' motivational processes are similar to those proposed by the RST model, we conducted a functional MRI study, in which 24 healthy subjects performed an interactive game that engaged the different motivational systems using distinct time periods (states) of punishment, reward, and conflict. Crucially, we found that the different motivational states elicited activations in brain regions that corresponded exactly to the brain systems underlying RST. Moreover, dynamic causal modeling of each motivational system confirmed that the coupling strengths between the key brain regions of each system were enabled selectively by the appropriate motivational state. These results may shed light on the impairments that underlie psychopathologies associated with dysfunctional motivational processes and provide a translational validity for the RST.
Introduction
Many animal studies suggest that an organism's behavior is modulated by its individual sensitivity to different reinforcers (for review, see Cardinal et al., 2002). Gray and McNaughton (2000) formulated these animal findings into a single comprehensive and influential model: the reinforcement sensitivity theory (RST), which attempts to account for the observed variability in an organism's behavior. This model includes three major motivational systems mediating behavioral responses to environmental stimuli: (1) the fight-flight-freeze system (FFFS), which mediates sensitivity to aversive stimuli such as punishment, to yield defensive approach (fight) or defensive avoidance (flight, freeze) behaviors; (2) the behavioral activation system (BAS), which mediates sensitivity to appetitive stimuli such as reward to facilitate approach behavior; and (3) the behavioral inhibition system (BIS), which is sensitive to goal-conflict situations such as stimuli of mixed or ambiguous values, and produces adaptive behavioral selection while inhibiting alternative plans.
Findings from animal studies have also identified three separate neural networks underlying the proposed motivational systems. Specifically, the FFFS, mediating response to aversive stimuli, is thought to involve activation of the periaqueductal gray, medial hypothalamus, central amygdala, and subgenual anterior cingulate cortex (sgACC). The BAS, mediating response to appetitive stimuli, is suggested to involve activation of the ventral tegmental area (VTA), nucleus accumbens (NAcc) and dorsomedial prefrontal cortex (dmPFC). Finally, the BIS is thought to rely on activations in the septohippocampal system and the ventromedial prefrontal cortex (vmPFC).
Although the RST model is appealing as a theoretical framework, and is supported by extensive empirical findings, it was established solely on evidence from animal studies, and thus its generalization to humans is not obvious (Smillie et al., 2007). Recent findings from functional MRI (fMRI) studies suggest that several functions assigned by the RST to specific brain regions are relevant in humans as well, especially the amygdala in response to aversive stimuli and the VTA and NAcc in response to appetitive stimuli (O'Doherty et al., 2002; Phelps, 2006; Haber and Knutson, 2010). Nevertheless, the translation of reinforcement paradigms to human study remains incomplete (Avila et al., 2008).
The aim of the current study was to simultaneously probe the neural networks that underlie human motivational processes as they function in real-life situations and to examine whether they correspond to those proposed by the RST model. To achieve this aim, we used an interactive game that included distinct expressions of relevant motivational states and that has been previously shown to elicit increased amygdala response to punishment and NAcc response to reward (Kahn et al., 2002; Assaf et al., 2009; Admon et al., 2012). We analyzed the individual pattern of neural responses to these stimuli with dynamic causal modeling (DCM) (Friston et al., 2003). This analysis allowed us to estimate the strength of coupling within each neuronal network or subgraph corresponding to a motivational system and, crucially, how these connection strengths were modulated by the different motivational states of punishment, reward, and goal-conflict.
Materials and Methods
Participants.
We studied 24 healthy 18-year-old participants (12 males). All were right-handed, had no reported history of psychiatric or neurological disorders, no current use of psychoactive drugs, and no family history of major psychiatric disorders. The protocol was approved by the Tel Aviv Sourasky Medical Center Ethics Committee. All participants provided written informed consent before participation.
fMRI paradigm.
Participants played a two-player competitive domino game where the opponent's responses were randomly generated by a computer in a predetermined pattern to allow a balanced design. Players, however, were told that the opponent was the experimenter and that their choices could increase their chances of winning. At the beginning of each game, 12 random domino chips were assigned to the player and were shown on the bottom part of the board, while one master domino chip, which remained constant throughout the game, appeared on the top left corner of the board. Players won the game if they were able to successfully dispose of all 12 chips within 4 min. Each assigned chip could either match the master chip (have one of the master chip's numbers) or not. In each round of the game, players had to choose one chip (i.e., decision making), place it face down adjacent to the master chip (i.e., execution), and then wait for the opponent's response (i.e., anticipation) to see whether the opponent challenged this choice by uncovering the chosen chip or not (i.e., outcome). Since the master chip remained constant throughout the game, it was only possible to win by choosing both matching and nonmatching chips. In the game context, matching chips are considered safe moves, since they are associated with rewards if uncovered and nonmatching chips are considered risky moves, since they are associated with punishments if uncovered. 
Specifically, based on the player's choice and opponent's response, there are four possible consequences per game round (i.e., outcome possibilities): (1) show of a nonmatch chip: the choice of a nonmatch chip is exposed and the player is punished by being given the selected chip back, plus two additional chips from the deck; (2) no show of a nonmatch chip: the choice of a nonmatch chip remains unexposed and only the selected chip is disposed of, so the player is not punished; (3) show of match chip: the choice of a match chip is exposed and the player is rewarded by disposal of the selected chip and one additional random chip from the game board; and (4) no show of a match chip: the choice of a match chip is not exposed and only the selected match chip is disposed of, so the player is not rewarded. Overall, player's choices and opponent's responses are interactively determined by the flow of the game round after round, creating a natural progression of the game situation that lasts 4 min or until the player wins. Each player played consecutively for 14 min (average number of games ± SEM: 4.23 ± 0.06). For more details of the game, see Figure 1, as well as Kahn et al. (2002).
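The four outcome possibilities can be summarized as a small lookup from the player's choice and the opponent's response. The following is a minimal sketch, not the original game code: the names `Outcome` and `resolve_round` are hypothetical, and the net chip change is inferred from the description above (an exposed nonmatch returns the selected chip and adds two more; an exposed match disposes of the selected chip and one additional chip).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    label: str        # outcome name as used in the text
    chips_delta: int  # net change in the player's chip count after the round


def resolve_round(is_match: bool, opponent_shows: bool) -> Outcome:
    """Map one round (player's choice + opponent's response) to its outcome."""
    if opponent_shows and not is_match:
        # Exposed nonmatch: selected chip is returned, plus two chips from the deck.
        return Outcome("punishment", +2)
    if not opponent_shows and not is_match:
        # Unexposed nonmatch: only the selected chip is disposed of.
        return Outcome("non-punishment", -1)
    if opponent_shows and is_match:
        # Exposed match: selected chip and one additional chip are disposed of.
        return Outcome("reward", -2)
    # Unexposed match: only the selected chip is disposed of.
    return Outcome("non-reward", -1)


if __name__ == "__main__":
    for is_match in (False, True):
        for shows in (True, False):
            print(is_match, shows, resolve_round(is_match, shows))
```

Note that the two "no show" outcomes change the chip count identically; they differ only in what the player risked, which is why they are treated as separate events.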
Figure 1. Domino game paradigm. Each round of the game was composed of four intervals: the player chose which chip to play next (first interval: “Choose”; 4 s), moved the cursor to the chosen chip, and placed it face down adjacent to the master chip (second interval: “Ready” and “Go”; 4 s). The player then waited for the opponent's response (third interval: “Anticipation”; jittered randomly to 3.4, 5.4, or 7.4 s), and saw whether the opponent challenged this choice by uncovering the chosen chip or not (fourth interval: “Outcome”; jittered randomly to 3.4, 5.4, or 7.4 s). The player's choices and opponent's responses were interactively determined by the flow of the game round after round, creating a natural and unpredictable progression of a game situation that lasted 4 min or until the player won. Each player played consecutively for 14 min (average number of games ± SEM: 4.23 ± 0.06).
Motivational states.
To probe the neural representation of FFFS, we contrasted the brain's response to punishment (and non-reward) outcomes to the response to reward (and non-punishment) outcomes in the game (i.e., opponent's “show” following a player's nonmatch choice or “no-show” following a match choice vs opponent's “show” following player's match choice or “no-show” following a nonmatch choice). To probe the neural representation of BAS, we contrasted the opposite responses of reward versus punishment (with reward and punishment denoting the same events as in FFFS). The goal-conflict during the “Choose” time interval, in which players were required to decide between the safe, possibly rewarding chip, and the risky, possibly punishing one, was used to probe the neuronal representation entailed by the BIS. Events in which there was no real conflict are rare yet possible (i.e., if only matching chips or only nonmatching chips remained to choose from). When such events occurred, we excluded them by adding a regressor to the general linear model (<2% of our data).
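The grouping of outcomes into motivational states, and the opposing FFFS and BAS contrasts, can be sketched as follows. This is an illustration under stated assumptions rather than the analysis code: the regressor order, the names `STATE_OF_OUTCOME`, `CONTRASTS`, and `contrast_value` are hypothetical, and the BIS weight vector (conflict alone) is an assumption, since the text does not specify the baseline for the conflict contrast.

```python
# Outcome events pooled into the two outcome-related motivational states.
STATE_OF_OUTCOME = {
    "punishment": "punishment",
    "non-reward": "punishment",
    "reward": "reward",
    "non-punishment": "reward",
}

# Assumed regressor order for the weight vectors below.
REGRESSORS = ("punishment", "reward", "conflict", "common")

CONTRASTS = {
    "FFFS": (1, -1, 0, 0),  # punishment > reward
    "BAS": (-1, 1, 0, 0),   # reward > punishment (the opposite contrast)
    "BIS": (0, 0, 1, 0),    # assumption: conflict tested on its own
}


def contrast_value(system: str, betas: dict) -> float:
    """Weighted sum of per-regressor effect sizes for one contrast."""
    weights = CONTRASTS[system]
    return sum(w * betas[name] for w, name in zip(weights, REGRESSORS))


if __name__ == "__main__":
    # Made-up effect sizes, for illustration only.
    example = {"punishment": 2.0, "reward": 0.5, "conflict": 1.0, "common": 3.0}
    for system in CONTRASTS:
        print(system, contrast_value(system, example))
```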
Notably, despite the known theoretical difference between reward and non-punishment, a previous study has shown that in this game, subjects perceive the two states as similarly rewarding; this was also the case for the punishing value of punishment and non-reward (Assaf et al., 2009). To test for differences in the activation level between the different states, we conducted paired t tests comparing the BOLD percentage signal change (PS) of the two states in each relevant region of interest (ROI; i.e., ROIs defined in the DCM model as receiving input from each state). These t tests indeed revealed no difference in PS for any of the chosen ROIs in their relevant states: for the amygdala there was no difference in PS between punishment and non-reward states (t(23) = 0.17, p = 0.86) and for the NAcc there was no difference in PS between reward and non-punishment states (t(23) = 0.90, p = 0.37). These findings suggest that, at the neural level, the two different states within reward or punishment elicited similar activation during this paradigm.
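For readers unfamiliar with the test, the paired t statistic is the mean within-subject difference divided by its standard error, with n − 1 degrees of freedom (23 for the 24 participants here). The sketch below uses only the Python standard library; the function name `paired_t` and the example values are hypothetical and are not the study's data.

```python
import math
import statistics


def paired_t(state_a, state_b):
    """Return (t statistic, degrees of freedom) for paired samples."""
    if len(state_a) != len(state_b):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(state_a, state_b)]
    n = len(diffs)
    mean_diff = statistics.fmean(diffs)
    # Sample standard deviation (n - 1 in the denominator).
    sd_diff = statistics.stdev(diffs)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    return t_stat, n - 1


if __name__ == "__main__":
    # Made-up percentage signal change values for four subjects.
    punishment = [1.0, 2.0, 3.0, 4.0]
    non_reward = [0.0, 2.0, 2.0, 5.0]
    print(paired_t(punishment, non_reward))
```

Converting the statistic to a p value additionally requires the t distribution's cumulative distribution function, which is not in the standard library and is left out of this sketch.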
fMRI data acquisition.
MRI scans were acquired on a 3.0T MRI scanner (Signa EXCITE; GE Healthcare) with a standard eight-channel head coil using gradient echo-planar imaging sequence of functional T2*-weighted images (TR, 2500 ms; TE, 35 ms; flip angle, 90°; FOV, 20 × 20 cm; matrix size, 64 × 64) divided into 44 axial slices (thickness, 3 mm with no gap) covering the entire brain.
fMRI data analysis.
Statistical Parametric Mapping (SPM5; Wellcome Department of Imaging Neuroscience, London, UK) and the MarsBaR toolbox were used with Matlab 7.6 (MathWorks). Preprocessing of functional scans included slice timing and head-movement correction, normalizing the images to MNI space, and finally spatially smoothing the data (FWHM, 6 mm). In addition, a set of harmonics was used to account for low-frequency noise in the data (1/128 Hz), and the first six images of each functional scan were rejected to allow for T2* equilibration effects. SPM8 was used with Matlab 7.6.0.324 for the DCM analysis.
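Several quantities implied by the reported acquisition and preprocessing parameters can be derived with simple arithmetic. The sketch below only restates values given in the text (TR, FOV, matrix size, slice count and thickness, FWHM, high-pass cutoff, number of discarded images); the variable names are hypothetical, and the FWHM-to-sigma conversion is the standard Gaussian relation.

```python
import math

TR_S = 2.5             # repetition time, seconds
FOV_MM = 200.0         # field of view, 20 cm per side
MATRIX = 64            # in-plane matrix size
N_SLICES = 44
SLICE_MM = 3.0         # slice thickness, no gap
FWHM_MM = 6.0          # smoothing kernel
HIGHPASS_HZ = 1 / 128  # low-frequency cutoff
N_DISCARDED = 6        # images rejected per functional scan

in_plane_mm = FOV_MM / MATRIX             # in-plane voxel edge length
coverage_mm = N_SLICES * SLICE_MM         # extent covered by the slice stack
cutoff_period_s = 1 / HIGHPASS_HZ         # slowest retained period
cutoff_period_scans = cutoff_period_s / TR_S
discarded_s = N_DISCARDED * TR_S          # time removed for T2* equilibration
# Gaussian kernel: FWHM = 2 * sqrt(2 * ln 2) * sigma
sigma_mm = FWHM_MM / (2 * math.sqrt(2 * math.log(2)))

if __name__ == "__main__":
    print(f"in-plane voxel size: {in_plane_mm} mm")
    print(f"slice coverage: {coverage_mm} mm")
    print(f"high-pass period: {cutoff_period_s} s ({cutoff_period_scans} scans)")
    print(f"discarded time: {discarded_s} s")
    print(f"smoothing sigma: {sigma_mm:.3f} mm")
```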
Functional identification of regions of interest.
The size of the effect for each state for each participant was computed using a general linear model (GLM). We included four regressors in the design matrix that corresponded to four stimulus functions convolved with a canonical hemodynamic response function. Three of the GLM regressors encoded our predetermined motivational states (i.e., punishment, reward, and conflict), while a fourth regressor encoded nonspecific effects common to all trials (i.e., common effects). Individual statistical parametric maps were calculated for contrasts of interest testing for the effect of punishment, reward, and conflict. These individual statistical parametric maps were used primarily to identify subject-specific ROIs for subsequent effective connectivity analyses using DCM. Specifically, functional time series of the BOLD signal were extracted from each ROI in a subject-specific fashion by placing a sphere of 15 voxels around the individual peak activation of each subject within a group ROI.
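To illustrate what "a stimulus function convolved with a canonical hemodynamic response function" means, the following sketch builds one regressor from event onsets. It is a simplified stand-in, not SPM's implementation: the double-gamma shape parameters (6 and 16, with an undershoot ratio of 1/6) are assumed typical values, the convolution is done at TR resolution rather than at a finer time grid, and the names `double_gamma_hrf` and `build_regressor` are hypothetical.

```python
import math

TR_S = 2.5  # repetition time from the acquisition parameters


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=6.0):
    """Double-gamma response at time t (seconds); shape values are assumptions."""
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = t ** (under_shape - 1) * math.exp(-t) / math.gamma(under_shape)
    return peak - under / ratio


def build_regressor(onsets_s, durations_s, n_scans, tr=TR_S, hrf_len_s=32.0):
    """Boxcar stimulus function sampled at each scan, convolved with the HRF."""
    boxcar = [0.0] * n_scans
    for onset, duration in zip(onsets_s, durations_s):
        for scan in range(n_scans):
            if onset <= scan * tr < onset + duration:
                boxcar[scan] = 1.0
    kernel = [double_gamma_hrf(k * tr) for k in range(int(hrf_len_s / tr) + 1)]
    # Discrete causal convolution, truncated to the scan length.
    return [
        sum(boxcar[scan - k] * kernel[k] for k in range(len(kernel)) if scan - k >= 0)
        for scan in range(n_scans)
    ]


if __name__ == "__main__":
    # One hypothetical 4 s event starting 10 s into the run.
    regressor = build_regressor([10.0], [4.0], n_scans=40)
    print([round(v, 3) for v in regressor[:12]])
```

The predicted response rises only after the event onset and dips below baseline later, which is the delayed, dispersed shape the GLM fits to each voxel's time series.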
DCM analysis.
For general information regarding DCM methods, see Friston et al. (2003). In our analyses, we defined three networks (i.e., subgraphs) corresponding to the motivational systems suggested by RST. Within each subgraph, the coupling architecture was the same but differed in terms of which motivational effects changed connection strengths and directly drove regional responses. We identified a significant modulation or enabling of each and every connection by motivational state (punishment, reward, conflict, or common effects) using Bayesian model comparison. Specifically, for the FFFS, we defined bidirectional effective connectivity between the amygdala, hypothalamus, and sgACC with modulatory effects on all connections and direct input to the hypothalamus and amygdala (see Fig. 3B.I). For the BAS, we defined bidirectional effective connectivity between the NAcc and dmPFC, with modulatory effects on the NAcc–dmPFC connection and direct input to NAcc (see Fig. 3B.II). For the BIS, we defined bidirectional effective connectivity between the hippocampus and vmPFC, with modulatory effects on both connections and direct input to the hippocampus (see Fig. 3B.III). Model comparison was performed separately for each motivational subgraph with a fixed effects Bayesian model selection procedure (Penny et al., 2010). Model selection was based on a free energy approximation to model log evidence (Kass and Raftery, 1995), which was used to compute log Bayes factors or log evidence differences. Following Stephan et al. (2010), we considered the evidence for one model over another to be significant when the log evidence difference exceeded three. This corresponds to a Bayes factor or evidence ratio of exp(3) ≈ 20, which is loosely comparable to a p value of 0.05. Finally, posterior densities were estimated over models to report the winning model in terms of the one with the highest posterior probability.
The parameters of this winning model were averaged over subjects using a Bayesian model averaging procedure (Stephan et al., 2010).
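The arithmetic behind the model comparison can be sketched as follows. Under a fixed effects assumption (the same model generated every subject's data), log evidences add across subjects; with equal prior probabilities over models, posterior model probabilities follow from a normalized exponential of the summed log evidences. This is a minimal illustration, not SPM's routine: the function names and the example log evidences are hypothetical.

```python
import math


def fixed_effects_bms(log_evidence):
    """log_evidence[subject][model] -> (group log evidences, posterior probabilities).

    Assumes equal prior probability for every model.
    """
    n_models = len(log_evidence[0])
    group = [sum(subject[m] for subject in log_evidence) for m in range(n_models)]
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(group)
    weights = [math.exp(g - top) for g in group]
    total = sum(weights)
    return group, [w / total for w in weights]


def log_bayes_factor(group, model_i, model_j):
    """Log evidence difference in favor of model_i over model_j."""
    return group[model_i] - group[model_j]


if __name__ == "__main__":
    # Made-up log evidences for two subjects and two models.
    example = [[-10.0, -12.0], [-11.0, -12.0]]
    group, posterior = fixed_effects_bms(example)
    log_bf = log_bayes_factor(group, 0, 1)
    print(group, log_bf, math.exp(log_bf), posterior)
```

In this example the log evidence difference is exactly three, giving a Bayes factor of about 20 and, for two models with equal priors, a posterior probability of about 0.95 for the better model.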
Results
Activations of proposed motivational systems
Figure 2 presents an overlay map of activations elicited by our three predetermined independent whole-brain contrasts of interest. As expected, each contrast elicited a differential pattern of distributed brain activations (Table 1). Importantly for our a priori brain hypothesis, the amygdala, hypothalamus, and sgACC responded only to punishment (Fig. 2, red), while the NAcc and dmPFC responded only to reward (Fig. 2, green), and the hippocampus and vmPFC responded during goal-conflict only (Fig. 2, blue). Notably, although our regions of interest were theoretically predefined by the RST model, these regions also functionally emerged independently from our whole-brain analysis. These results may represent initial evidence that brain responses to each motivational state indeed correspond to the motivational neural systems proposed by the RST model.
Figure 2. Whole-brain analysis. Each of our three contrasts of interest elicited a differential pattern of distributed brain activations that highly corresponded to the motivational neural systems as proposed by the RST model. During response to punishment, activations were observed in the amygdala (1), hypothalamus (2), and sgACC (3), corresponding to the FFFS (red). During response to reward, activations were observed in the NAcc (4) and dmPFC (5), corresponding to BAS (green). Finally, during periods of goal-conflict, activations were observed in the hippocampus (6) and vmPFC (7), corresponding to BIS (blue). n = 24.
Table 1. Peak of activations elicited by our three predetermined independent whole-brain contrasts of interest (i.e., motivational states)
Modeling subgraphs of proposed motivational systems
The brain regions of each motivational system that are relevant according to the RST model, and that increased their activations in our whole-brain contrasts, were aggregated into three specific subgraphs. Figure 3 shows the winning models according to Bayesian model selection (Fig. 3A). As expected, for each subgraph, the winning model was the one modulated by its assigned RST motivational state. Specifically, for the FFFS, the coupling between the amygdala, hypothalamus, and sgACC was selectively modulated during punishment but not during reward or conflict states. Punishment had a significant input to the amygdala and hypothalamus and also modulated their effective connectivity (Fig. 3B.I). Similarly, for the BAS, the effective connectivity between the NAcc and dmPFC was selectively modulated while receiving reward, and not while receiving punishment or during goal-conflict. The reward state drove the NAcc and was the only one to modulate these connections (Fig. 3B.II). Finally, for the BIS, the effective connectivity between the hippocampus and vmPFC was selectively modulated when participants were facing a goal-conflict state but not during punishing or rewarding outcomes. Goal-conflict state showed significant input to the hippocampus and modulated the connections between these regions (Fig. 3B.III).
Figure 3. DCM results. A, Model comparison. Each comparison refers to one subgraph, consisting of four models with the same architecture (i.e., effective and modulatory connections), defined by different motivational states as inputs: (1) common effects, (2) conflict, (3) punishment, and (4) reward. B, Winning models. I, Effective connections were found to be significant among amygdala, hypothalamus, and sgACC during punishment interval, all suspected to be part of the FFFS. Punishment state had significant input to the amygdala and hypothalamus and modulatory effects on all connections between regions. II, Effective connections were found to be significant between NAcc and dmPFC under reward state, as proposed for BAS. Reward state had significant input to NAcc and modulatory effects on the NAcc–dmPFC connection. III, Effective connections were significant between the hippocampus and vmPFC during goal-conflict, as proposed for BIS. Goal-conflict state had significant input to the hippocampus and modulatory effects on all connections between regions. Tables display averaged parameter estimates (in Hz). n = 24.
Discussion
By manipulating both aversive and appetitive drives, as well as the process of choosing between them under goal-conflict, we were able to probe the neural substrate of distinct motivational processes in humans. Further, simultaneous manipulation of all these states in an unpredictable manner better simulated a real-life situation. Finally, by using a model-driven analysis, we were able to test in humans a well established neural theory of motivation whose construction was based on extensive animal findings. Together, our results display a remarkable correspondence to the proposed RST motivational networks, while highlighting specific regions as core elements in the operation of the FFFS, BAS, and BIS. Furthermore, DCM analyses confirmed that connectivity strength between neural components of each motivational system increased under the relevant state. As defined by Friston (2011), the effective connectivity analysis performed by DCM allows inferences to be made regarding the influence that one neuronal component exerts on another in the network under a specific experimental state. Our findings thus imply that the neural operation of each motivational system in the appropriate context is associated with increased effective connectivity between separate components, a mechanism that may allow the coexistence of different motivational systems within the brain. This is somewhat similar to other suggested neural networks that were shown to increase functional coupling when their function was called on, for example, in the case of the hippocampus–vmPFC circuit in relation to fear extinction (Milad et al., 2007), as well as in the difference in neural coupling during emotion regulation, where downregulation versus upregulation of negative feelings recruited different neural networks (Ochsner et al., 2004).
Few studies have examined the neural networks that underlie motivational processes in humans. For example, Camara et al. (2008) investigated functional connectivity during gains and losses in a gambling task, using the NAcc as a seed region for correlation detection. They found similar networks for both states, including the orbitofrontal cortex, amygdala, insular cortex, and the hippocampus. However, such connectivity analyses could not account for a temporal or causal relationship between regions. DCM analyses have been previously used for investigations of complex processes such as retrieval of emotional memories (Smith et al., 2006) or the process of affective prosody (Ethofer et al., 2006). More relevant to our case, Alexander and Brown (2010) applied DCM analysis to investigate the role of the anterior cingulate cortex (ACC) in signaling perception of risk and predicted reward. They found that computation of reward magnitude and error likelihood is independent (i.e., one is not modulated by the other) and that this computation is intrinsic to the ACC and not received from elsewhere. The authors attribute these findings to the role of the ACC in evaluation of risk versus rewards, a function that the RST relates to BIS. Nonetheless, since their DCM models did not include any of the RST-proposed BIS components, and given the RST prediction that the BIS receives information from the ACC during goal conflict, these results may represent one operational node of the BIS network. Notably, this highlights the importance of comprehensive model hypothesis when applying DCM analysis.
To the best of our knowledge, our study is the first comprehensive investigation of the neural networks that underlie motivational processes in humans using sophisticated causal modeling. It should be noted that our whole-brain analysis results were obtained at a relatively liberal (uncorrected) statistical threshold and thus further investigation of motivational networks in humans is still needed to validate our findings. However, given that our study was driven by a model with a clear a priori hypothesis, the whole-brain analyses performed here were not crucial to the identification of the critical nodes of each motivational network but rather aimed to explore other regions involved in the networks. Indeed, some regions within the distributed activations that were elicited in our whole-brain analyses are not mentioned in the RST (e.g., the insula and dorsolateral prefrontal cortex in response to reward and goal-conflict states, respectively). Conversely, some brain regions that were assigned by the RST to the motivational networks were not revealed in our whole-brain analyses (e.g., the VTA and septal area in response to reward and goal-conflict states, respectively). This signifies that the RST neural model is a simplification of the complex response to motivational signals, especially in humans, which probably involves multiple interacting brain regions. Thus, the subgraphs analyzed in this paper are clearly part of larger networks. This, however, does not represent a problem for our interpretation in the sense that effective connectivity in dynamic causal modeling can be polysynaptic; in other words, it can be mediated vicariously by regions not included in the model. Finally, it should also be noted that individual differences in the sensitivity to different reinforcements constitute an integral part of the RST that is not addressed in this study.
Future investigations should include appropriate psychometric measurements of motivational traits to validate the suggested association between neural activation and RST motivational systems at the individual level.
Despite certain limitations, the remarkable resemblance between our findings in humans and the animal-based RST model strengthens the validity of psychological findings in animal research and the feasibility of its translation to human processes. Furthermore, considering the fact that dysfunctional motivational processes have been implicated in several psychopathologies such as anxiety (McNaughton and Corr, 2008), mood (Depue and Iacono, 1989; Johnson, 2005) and personality disorders (Pastor et al., 2007; Völlm et al., 2007), further exploration of animal models of motivational processes and their parallels in the human brain may promote our ability to understand the neural impairments underlying such pathologies.
Footnotes
This work was supported by the Israeli Ministry of Science and Sport (T.H.), the Israeli Defense Forces Medical Corps (T.H.), the Levy Edersheim Gitter Institute for Neuroimaging (T.G. and R.A.), and the Adams Super Center for Brain Studies, Tel Aviv University (R.A. and T.H.).
The authors declare no financial conflicts of interests.
Correspondence should be addressed to Talma Hendler, the Functional Brain Center, Wohl Institute for Advanced Imaging, Tel Aviv Sourasky Medical Center, Weizmann 6, Tel-Aviv 64239, Israel. talma@tasmc.health.gov.il