Animals constantly adjust their actions based on new experiences. Thus, rewarding experiences and non-rewarding or aversive experiences are integrated to guide behaviors. A central neural system for coding of rewards and reward prediction is the dopaminergic system (Rescorla, 1988; Watabe-Uchida et al., 2017). Dopaminergic neurons in the basal ganglia receive inputs from a wide range of brain regions (Beier et al., 2015), including the lateral habenula (LHb). The LHb is an epithalamic nucleus that acts as a relay station and integration hub for reward signals. It receives inputs from the limbic system and the basal ganglia, and it influences both the dopaminergic system, which controls motivation, and the serotonergic system, which controls mood (Proulx et al., 2014). LHb neurons spike when animals experience events with negative valence, such as aversive stimuli or the absence of an expected reward (Matsumoto and Hikosaka, 2007, 2009). LHb neurons can activate dopaminergic neurons in the ventral tegmental area (VTA; Hikosaka, 2010). However, they can also inhibit dopaminergic neurons via the GABAergic rostromedial tegmental nucleus (RMTg). This disynaptic connection is thought to transmit information about negative-reward prediction from LHb to dopaminergic neurons (Matsumoto and Hikosaka, 2007, 2009). Thus, when LHb neurons are excited, they can suppress the activity of dopaminergic neurons, and lead to the reduction of motor behaviors (Hikosaka, 2010). However, two questions remain unanswered: whether stimuli indicating the absence of one reward drive animals to shift their action toward obtaining a different reward (rather than just suppressing the non-rewarded action), and, if that is true, whether the LHb is required for the shift.
In a recent article in The Journal of Neuroscience, Laurent et al. (2017) investigated the neural basis of such behaviors using a so-called Pavlovian-instrumental transfer paradigm in rats. First, rats were taught to associate two different sounds (S1 and S2, a click and a tone), each with a particular reward outcome (O1 or O2, a food pellet or sucrose solution). Next, rats were trained to press one lever (A1) to receive O1, and to press another lever (A2) to receive O2. Finally, the two tasks were combined to test whether presentation of sound S1 would bias rats toward pressing A1 over A2. Thus, in Pavlovian-instrumental transfer, the presentation of a conditioned stimulus (CS; here the sound S1) increases the action (A1) that retrieves the reward (O1) predicted by that CS.
Laurent et al. (2017) then explored what happens when a CS predicts the absence rather than the availability of a specific reward. To generate such negative-reward predictions, the CS was presented right after the delivery of a reward, in so-called backward Pavlovian training (Laurent and Balleine, 2015). They then asked whether presentation of the CS that signals absence of a reward would simply fail to promote the associated action (e.g., reducing pressing of lever A1) or whether it would bias rats toward the alternative action (e.g., increasing pressing of lever A2). The authors found that a negative-reward prediction for one outcome did, in fact, shift behavior toward the alternative action. Importantly, rats with lesions of the LHb did not show any preference for one action over the other, indicating that the shift in action selection based on negative-reward prediction required the LHb. Interestingly, LHb lesions produced no changes in action selection based on positive rewards, indicating that LHb neurons are specifically required for signaling negative-reward prediction.
What might be the neural circuit mechanisms underlying these observations? The LHb contains neuron populations that project to several downstream targets. Laurent et al. (2017) found that selectively ablating the LHb neuron population that innervates the RMTg replicated the negative-reward prediction phenotype seen in LHb-lesioned rats. Thus, LHb outputs to the RMTg are critically important for relaying outcome-specific inhibition downstream of the LHb.
Previous studies provide some insights into the upstream nuclei that may relay negative reward signals to the LHb. The globus pallidus signals negative motivational value (Hong and Hikosaka, 2008), and LHb neurons transmit such signals to VTA and RMTg neurons (Jhou et al., 2009; Stamatakis and Stuber, 2012). Considering this circuit arrangement, there are multiple sites where Pavlovian training may trigger plasticity to accommodate the learning process. Backward Pavlovian training with negative reward as the unconditioned stimulus might first produce plasticity in the globus pallidus. However, previous studies also provided evidence that the efficacy of synaptic transmission in LHb neurons can be modified during learning (Meye et al., 2013). In aversive Pavlovian learning, LHb neurons increase firing in response to conditioned stimuli as training proceeds, and this potentiation remains for 24 h during extinction sessions (Wang et al., 2017). Another study showed that exposure to stressors can facilitate the induction of long-term potentiation (LTP) in LHb neurons (Park et al., 2017). Therefore, backward Pavlovian training might modify the synaptic inputs to LHb neurons and make the LHb the storage site for the negative-reward prediction.
The finding that ablating RMTg-projecting LHb neurons produces a phenotype similar to that of LHb lesions in the Pavlovian-instrumental transfer paradigm is intriguing. It suggests that the LHb neurons relevant for this behavioral paradigm inhibit the activity of VTA dopaminergic neurons via the GABAergic neurons in the RMTg. Downstream, VTA dopaminergic neurons project to the dorsomedial striatum, an area implicated in goal-directed instrumental conditioning (Balleine et al., 2009; Hikosaka, 2010). Therefore, during Pavlovian-instrumental transfer, the signals of negative-reward prediction transmitted from the LHb to the RMTg are likely further relayed to the dorsomedial striatum. It remains to be seen whether there are indeed neuronal populations in the dorsomedial striatum that encode negative-reward prediction during Pavlovian training or when rats are facing the instrumental choice. Such a dissection will require time-resolved monitoring and manipulations with optogenetic tools.
Altogether, the study by Laurent et al. (2017) demonstrated the critical function of LHb neurons in using cues for the lack of reward to instruct behaviors for an outcome-specific choice. Because abnormal activity of LHb neurons contributes to both depression in humans and depression-related behaviors in rodents (Sartorius et al., 2010; Proulx et al., 2014; Cui et al., 2018), this study might help us to better understand how alterations in the LHb may result in modified emotional states. Thus, altered function of LHb neurons could result in an inability to incorporate into future decisions the lessons learned from the disappointment of past failures.
Footnotes
Editor's Note: These short reviews of recent JNeurosci articles, written exclusively by students or postdoctoral fellows, summarize the important findings of the paper and provide additional insight and commentary. If the authors of the highlighted article have written a response to the Journal Club, the response can be found by viewing the Journal Club at www.jneurosci.org. For more information on the format, review process, and purpose of Journal Club articles, please see http://jneurosci.org/content/preparing-manuscript#journalclub.
I thank Lisa Traunmüller, Dr. Oriane Mauger, and Dr. Peter Scheiffele for comments on the paper.
The author declares no competing financial interests.
Correspondence should be addressed to Dr. Le Xiao, Biozentrum, University of Basel, Klingelbergstrasse 50-70, 4056 Basel, Switzerland. le.xiao@unibas.ch