Every day, we encounter situations in which we must decide whether to continue what we are doing or move on to a potentially better option (e.g., going to the same place as last year for vacation or traveling somewhere new; staying in a secure but unsatisfying job or embarking on a new career path). These types of decisions are examples of the explore/exploit problem. Exploration is defined as choosing an option about which we have less information, whereas exploitation is sticking with an option about which we know more. The decisions people ultimately make in these situations depend upon their past experiences, which create expectations about the type and likelihood of rewards and punishments that a given environment will yield. For example, if an individual perceives that the world has been harsh and unfair, she might expect that any choice she makes will lead to an undesirable outcome. This individual may therefore be more likely to exploit a familiar, but nonoptimal, option than to explore an alternative path. Lenow et al. (2017) argued that stress is one factor that leads to perceptions of an environment being harsh, and therefore hypothesized that stress would facilitate tendencies to exploit rather than explore.
To systematically examine the effects of stress on explore/exploit behavior, Lenow et al. (2017) used a virtual patch-foraging task in which participants spent time in each of four orchards, with the goal of harvesting as many apples as possible. On each trial, participants had the option to stay at the current tree/patch or move to a different tree. Each subsequent harvest of the same tree resulted in slightly fewer apples, so at some point it would be advantageous to move on to the next tree. In orchards representing rich environments, travel time to the next tree was short, whereas in orchards representing harsh environments, travel time was longer. An experimental group completely submerged their arm in cold water to induce acute stress before the foraging task, whereas a control group submerged their arm in warm water. Cortisol responses to the stressor were used as a continuous measure of acute stress. Participants also reported their perceived chronic stress.
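The logic of the task can be illustrated with a minimal simulation. The sketch below is not the authors' implementation: the initial yield, the depletion rate, the harvest and travel times, and the function names are all assumptions chosen for illustration, and depletion is treated as deterministic for simplicity. It searches a grid of candidate exit thresholds for the one that gives the highest long-run reward rate, and shows that the best threshold is lower when travel time is longer.

```python
# Hypothetical parameters throughout: none of these values come from
# Lenow et al. (2017); they only illustrate the structure of the task.

def yields_before_leaving(threshold, initial_yield=10.0, decay=0.9):
    """Apples collected on each harvest of one tree.

    The forager always harvests once, then keeps harvesting while the
    next (depleted) yield would still be at least `threshold`.
    """
    yields = [initial_yield]
    while yields[-1] * decay >= threshold:
        yields.append(yields[-1] * decay)
    return yields


def reward_rate(threshold, travel_time, harvest_time=1.0,
                initial_yield=10.0, decay=0.9):
    """Long-run apples per unit time for a fixed exit threshold."""
    yields = yields_before_leaving(threshold, initial_yield, decay)
    time_per_tree = travel_time + harvest_time * len(yields)
    return sum(yields) / time_per_tree


# Candidate exit thresholds from 0.5 to 10.0 apples in steps of 0.1.
CANDIDATES = [x / 10 for x in range(5, 101)]


def best_threshold(travel_time, candidates=CANDIDATES):
    """Candidate threshold with the highest reward rate (first one on ties)."""
    return max(candidates, key=lambda th: reward_rate(th, travel_time))


if __name__ == "__main__":
    rich = best_threshold(travel_time=2.0)   # short travel: rich orchard
    harsh = best_threshold(travel_time=8.0)  # long travel: harsh orchard
    print("best exit threshold, rich orchard:", rich)
    print("best exit threshold, harsh orchard:", harsh)
```

Under these assumed values, the reward-maximizing forager leaves a tree earlier (at a higher threshold) when the next tree is close, and stays longer when the next tree is far away.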
Explore/exploit behavior was measured in terms of each participant's tree-level exit threshold (the average of the last two rewards before moving to the next tree). As expected, participants had higher exit thresholds in rich orchards than in harsh orchards, showing that they used environment quality to guide their decisions and recognized that the opportunity cost of moving between trees was higher in the harsh environment. In both environments, participants with higher cortisol responses and participants who reported higher chronic stress showed lower exit thresholds, indicating greater exploitation, than less stressed participants. To further explain these results, exit thresholds were compared against the optimal threshold for leaving a tree that would maximize one's reward. Deviations from this optimal value were calculated for each subject and characterized in terms of underexploitation and overexploitation. Both acute and chronic stress were associated with more overexploitation.
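The exit-threshold measure described above can be sketched in a few lines. This is an illustrative sketch only: the function names, the example reward sequences, and the optimal value of 6.0 are hypothetical, and averaging tree-level thresholds within a participant is an assumption about how the comparison to the optimal threshold might be carried out.

```python
from statistics import mean


def tree_exit_threshold(rewards):
    """Average of the last two rewards received before leaving a tree."""
    if len(rewards) < 2:
        raise ValueError("need at least two harvests to compute a threshold")
    return (rewards[-2] + rewards[-1]) / 2


def classify_forager(trees, optimal_threshold):
    """Compare a participant's mean exit threshold with the optimal one.

    `trees` holds one reward sequence per tree visited. A mean threshold
    below the optimum means the participant stayed too long
    (overexploitation); above it, the participant left too early
    (underexploitation).
    """
    observed = mean(tree_exit_threshold(rewards) for rewards in trees)
    deviation = observed - optimal_threshold
    if deviation < 0:
        label = "overexploitation"
    elif deviation > 0:
        label = "underexploitation"
    else:
        label = "optimal"
    return deviation, label


if __name__ == "__main__":
    # Hypothetical reward sequences (apples per harvest) for two trees.
    trees = [[10, 9, 8, 7, 6], [10, 8, 6, 4]]
    print(classify_forager(trees, optimal_threshold=6.0))
    # prints (-0.25, 'overexploitation')
```

In this made-up example the two tree-level thresholds are 6.5 and 5.0, so the mean threshold of 5.75 falls below the assumed optimum and the participant is classified as overexploiting.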
Based on these results, the authors suggest that stress leads to overexploitation through biased perceptions of environmental quality (i.e., stressed individuals perceive the environment to be harsher than it actually is). There are other potential mechanisms that could influence behavior on foraging tasks, however. One possibility is that stress biases individuals toward inaction rather than action. A recent study found that stress specifically impaired learning to produce an action, regardless of whether participants needed to act to gain a reward or to avoid a punishment (de Berker et al., 2016). If moving between trees represents an action, a general bias toward inaction could explain stressed participants' tendency to stay at the same patch rather than move to a new patch. Stress has also been shown to shift an organism's attention and cognitive focus to the present over the future (Frankenhuis et al., 2016). A present-oriented bias could decrease exploration during a foraging task if it interferes with one's ability to conceptualize the potential reward that could be gained in a future patch.
Another potential mechanism through which stress may influence explore/exploit behavior is by reducing cognitive flexibility, which is facilitated by the prefrontal cortex (Kim et al., 2011). Optimal exploration requires cognitive flexibility because individuals must update representations about their environment (in this case, depletion rate of the tree and travel time) when deciding when to move on to a new patch. In contrast, remaining at the same tree for longer than is optimal could be a form of perseveration, which does not require prefrontal function and is associated with stress (Schwabe and Wolf, 2009). The prefrontal cortex is highly sensitive to stress. Acute stress leads to the release of glucocorticoids, which appear to reduce the function of the prefrontal cortex by disrupting intracellular signaling pathways (Arnsten, 2009). Chronic stress is also associated with reduced prefrontal function because chronically high levels of glucocorticoids appear to cause dendritic retraction and reduced spine number in this region (Joëls et al., 2007; Dias-Ferreira et al., 2009). At the same time, acute and chronic stressors appear to increase amygdala and striatal control over the prefrontal cortex (Hermans et al., 2011; Fareri and Tottenham, 2016), facilitating habit-based learning and perseveration (Schwabe and Wolf, 2009). In sum, both acute and chronic stressors appear to lead to impaired prefrontal function and increased reliance on striatal and limbic structures to guide decision-making. This altered brain function reduces cognitive flexibility and increases perseveration, potentially resulting in higher levels of exploitation.
The findings of Lenow et al. (2017) parallel those of a study that examined effects of early life stress on exploration and exploitation. Humphreys et al. (2015) compared adolescents who had been institutionalized as infants with adolescents who had no such history, using a reward task in which each pump of a balloon could lead to either accumulating more points or losing all of one's points. Previously institutionalized adolescents “cashed in” their earnings earlier than the comparison group, reflecting a tendency to exploit a safe option rather than explore the possibility of gaining more points. In addition, maternal separation in infancy has been associated with less physical exploration in adolescent rats (Spivey et al., 2008). These results are consistent with the notion that stress exposure reduces exploration.
The notion that stress interferes with an organism's tendency to explore different options has important implications for learning processes throughout the lifespan. Individuals learn in part through sampling information in their environment: in situations where the probabilities of various outcomes are unknown, individuals must explore different options to learn action-outcome associations (Sheth et al., 2011; Hertwig and Frey, 2017). If new information is sampled at a lower rate due to stress, then learning may be diminished. Indeed, recent studies show reduced associative learning ability in adolescents who were exposed to early childhood stress (Hanson et al., 2017; Harms et al., 2017). Results of Lenow et al. (2017) and Humphreys et al. (2015) suggest that this phenomenon could be partially explained by reduced exploration and information sampling due to stress.
In future research addressing the effects of stress on motivated behavior, it will be important to consider potential relationships between exploration/exploitation and reward processing, as well as their neural substrates. These processes rely on overlapping brain circuitry, including the striatum and prefrontal cortex (Daw et al., 2006). Stress exposure profoundly affects the structure and connectivity of prefrontal and striatal regions (Dias-Ferreira et al., 2009; Fareri and Tottenham, 2016), and severe early life stress has been linked to reduced striatal reward responsivity (Dillon et al., 2009; Goff et al., 2013). Furthermore, both striatum and prefrontal cortex are components of dopaminergic reward pathways. There is evidence that stress exposure alters dopamine function, although effects of stress on dopamine systems appear to be complex and may vary by type of stressor (Hollon et al., 2015). Further research is needed to examine systematically how different types of stressors influence dopamine systems, and how dopamine in turn regulates exploration and exploitation. This line of research could inform treatment for disorders in which dopamine systems may be disrupted, such as depression (Tye et al., 2013).
Although the negative aspects of stress are often emphasized, physical and behavioral responses to stressors evolved to promote survival of the organism. Stress tends to be associated with harsh environments, in which exploration is less likely to be associated with reward. Maladaptive effects of stress occur when there is a mismatch between the stress response and the current environment (e.g., an individual behaves as if he is in a harsh environment when he is in a rich environment). In Lenow et al. (2017), stress from the cold-water submersion and from daily life influenced decision-making during an unrelated foraging task, leading to overexploitation and reduced performance. More consequentially, neural and behavioral effects of early life stress appear to persist throughout the lifespan, potentially leading to alterations in decision-making that hinder learning. The findings of Lenow et al. (2017) reiterate the profound effects of stress on decision-making, but there is still much to learn about how specific types and timing of stress affect different aspects of motivated behavior, such as reward valuation, goal representations, and expectations about the future.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/preparing-manuscript#journalclub.
I thank Seth Pollak for comments on the manuscript. M.H. is funded by National Institute of Mental Health grant T32MH018931-28.
The author declares no competing financial interests.
Correspondence should be addressed to Dr. Madeline B. Harms, Department of Psychology, University of Wisconsin–Madison, 1202 West Johnson Street, Madison, WI 53706. mharms3{at}wisc.edu