Differential Neuromodulation of Acquisition and Retrieval of Avoidance Learning by the Lateral Habenula and Ventral Tegmental Area

Several studies suggest an opponent functional relationship between the lateral habenula (LHb) and the ventral tegmental area (VTA). Previous work has linked LHb activation to the inhibition of dopaminergic neurons during loss of reward, as well as to deficits in escape and avoidance learning. We hypothesized that a dopamine signal might underlie the negative reinforcement of avoidance responses and that LHb activation could block this signal and thereby cause avoidance deficits. To test this idea, we implanted stimulating electrodes in either the VTA or LHb of gerbils engaged in two-way active avoidance learning, a task that shows learning-associated dopamine changes and that is acquired faster following LHb lesions. We delivered brief electrical brain stimulation whenever the animal performed a correct response, i.e., when the successful avoidance of foot shock was hypothesized to trigger an intrinsic reward signal. During the acquisition phase, VTA stimulation improved avoidance performance, while LHb stimulation impaired it. VTA stimulation appeared to improve both acquisition and asymptotic performance of the avoidance response, as VTA-stimulated animals reached above-normal performance but reverted to normal responding when stimulation was discontinued. The effects of LHb stimulation during avoidance acquisition were long lasting and persisted even after stimulation was discontinued. However, when given after successful acquisition of avoidance behavior, LHb stimulation had no effect, indicating that LHb stimulation specifically impaired avoidance acquisition without affecting memory retrieval or motivation or ability to perform the avoidance response. These results demonstrate opponent roles of LHb and VTA during acquisition but not during retrieval of avoidance learning.


Introduction
The ventral tegmental area (VTA) responds to unexpected rewards with phasic increases in dopaminergic firing, and to unexpected omissions of reward with brief cessations in firing (Ljungberg et al., 1992; Schultz et al., 1993). While it is well known that this dopamine signal contributes to reward learning, it is less well known that it also contributes to avoidance learning (McCullough et al., 1993). According to the two-process theory of avoidance (Dinsmoor, 2001), an animal first learns that a conditioned stimulus (CS), such as a tone, will be followed by an aversive unconditioned stimulus (US), such as a shock. When the animal subsequently performs the correct avoidance response, this expectation of aversive outcome is violated and, in theory, should activate the dopamine reward system.
The strongest evidence supporting this idea comes from a series of microdialysis experiments showing increased cortical dopamine release in gerbils during the early stages of avoidance learning, coupled to the first trials when an animal successfully performs the avoidance response (Stark et al., 1999, 2000, 2001, 2004). In addition, increases in accumbal dopamine release are highly correlated with increases in avoidance responding (McCullough et al., 1993). More supporting evidence comes from a study of rats which were permitted to avoid shock versus rats which were not. The majority of brain differences were found in the mesocorticolimbic dopamine system, which was more active in rats that acquired the avoidance response (Coco and Weiss, 2005). Conversely, rats bred for deficits in escape learning show reduced metabolic activity throughout the mesocorticolimbic dopamine system while showing elevated habenula activity (Shumake et al., 2003).
The latter finding is interesting because the lateral habenula (LHb), an epithalamic structure, reciprocally connects with the VTA (Gruber et al., 2007; Geisler and Trimble, 2008) and inhibits dopaminergic neurons (Christoph et al., 1986; Ji and Shepard, 2007; Matsumoto and Hikosaka, 2007). In particular, LHb neurons are excited by either the absence of reward or the presence of punishment (Matsumoto and Hikosaka, 2009) and appear responsible for the dopamine suppression that occurs in the absence of expected rewards (Matsumoto and Hikosaka, 2007). We hypothesized that the same LHb-mediated dopamine suppression may, under some circumstances, interfere with the negative reinforcement of avoidance responses.
To test this idea, we implanted stimulating electrodes in either the VTA or LHb of gerbils engaged in two-way active avoidance learning, a task that shows learning-associated dopamine changes (Stark et al., 1999, 2000, 2001, 2004) and that is acquired faster following LHb lesions (Wilson et al., 1972). In addition to Stark et al. (1999, 2000, 2001, 2004), several other studies from our laboratory have demonstrated the suitability of gerbils for the investigation of learning mechanisms (Scheich et al., 1993; Ohl et al., 2001; Wetzel et al., 2008). We delivered brief electrical brain stimulation whenever a gerbil performed a correct response, i.e., when the successful avoidance of foot shock is proposed to trigger an intrinsic reward signal. We hypothesized that VTA stimulation might enhance this signal and thereby improve learning, whereas LHb stimulation might block this signal and thereby impair learning.

Materials and Methods
Subjects. Adult (3-6 months old, 80-105 g) male Mongolian gerbils (Meriones unguiculatus) were obtained from Tumblebrook Farms. Gerbils were housed individually starting 3 d before surgery and maintained on a 12 h light/dark cycle (lights on 7:00 A.M. to 7:00 P.M.) throughout the experiment. All experimental procedures were approved by the Ethics Committee of the State of Sachsen-Anhalt, Germany.
Surgical procedures. Surgery and implantation of electrodes were performed under ketamine (100 mg/kg) and xylazine (5 mg/kg) anesthesia. Animals were fixed in a stereotaxic frame (David Kopf Instruments). Bipolar stimulation electrodes with the tips separated by ~0.2 mm were custom made from Teflon-insulated stainless steel microwires (diameter: 140 µm; Science Products) and implanted at the level of the LHb (1.6 mm posterior, 0.6 mm lateral, 2.6 mm ventral to bregma) or VTA (2.6 mm posterior, 1.3 mm lateral, 5.0 mm ventral to bregma) according to a stereotaxic atlas for gerbils (Loskota et al., 1974). Due to the relatively long anterior-posterior extent of the habenula region compared to its dorsal-ventral and medial-lateral dimensions, bipolar stimulation electrodes were placed such that the two electrode tips were aligned along the rostral-caudal axis. The electrode was fixed in place with dental acrylic cement. Half of the subjects were implanted in the left hemisphere and half in the right. Training was initiated following a 4 d recovery period.
Shuttle box procedure and experimental design. Gerbils were trained in a shuttle box (38 cm × 19 cm × 22.5 cm) that had two compartments separated by a 6 cm high hurdle. Each daily session consisted of 60 trials with a variable intertrial interval of 20-24 s. A session began with a 3 min habituation period. The CS was a series of 2 kHz pure tones (6 s of 200 ms tones separated by 300 ms). With the termination of the CS, a 4 s footshock US was applied through the grid floor. The intensity of footshock was slowly raised from 0.4 to 0.6 mA during the first training session and kept at 0.6 mA for subsequent sessions. A custom-written shuttle box program delivered the electrical brain stimulation (EBS) using an isolated pulse stimulator (Model 2100, A-M Systems). Twenty biphasic pulses (100 Hz, 100 µs pulse width, 9.8 ms interpulse interval, 200 ms train length) were applied to either the LHb (100 µA) or the VTA (100-160 µA) immediately following each successful avoidance response (crossing the hurdle during the CS). A flexible cable connected to a swivel delivered the electrical brain stimulation while allowing the animal to move freely during shuttle-box learning.
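The reported stimulation parameters can be checked for internal consistency with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' shuttle box program. It assumes that the stated pulse width (100 µs) applies to each phase of a biphasic pulse, which agrees with the 0.2 ms pulse duration given for the self-stimulation test. All variable names are hypothetical.

```python
# Stimulation parameters as reported in the text, in integer microseconds
# to avoid floating-point rounding.
N_PULSES = 20
PHASE_WIDTH_US = 100            # assumption: 100 µs is the width of each phase
PHASES_PER_PULSE = 2            # biphasic pulse
INTERPULSE_INTERVAL_US = 9_800  # 9.8 ms between pulses

# One full cycle = one biphasic pulse plus the interval that follows it.
pulse_duration_us = PHASE_WIDTH_US * PHASES_PER_PULSE
pulse_period_us = pulse_duration_us + INTERPULSE_INTERVAL_US

frequency_hz = 1_000_000 / pulse_period_us
train_length_ms = N_PULSES * pulse_period_us / 1_000

print(f"pulse duration: {pulse_duration_us} µs")
print(f"pulse period:   {pulse_period_us} µs")
print(f"frequency:      {frequency_hz} Hz")
print(f"train length:   {train_length_ms} ms")
```

Under this assumption, the period is 10 ms, which reproduces both the 100 Hz frequency and the 200 ms train length stated for the 20-pulse train.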
There were two phases of training: an acquisition phase (sessions 1-5) and a postacquisition phase (sessions 6-10). Half of the animals received brain stimulation during the acquisition phase but not during the postacquisition phase, allowing us to assess the reversibility of stimulation effects. The other half served as sham controls for the acquisition phase and did not receive stimulation. In addition, the subset of control animals with LHb implants that did not receive stimulation during the acquisition phase was given stimulation during the postacquisition phase. This was done to assess whether inhibitory effects of LHb stimulation observed during acquisition were due to interference in learning versus a more direct suppression of the avoidance response.
Verification of electrode position and data analysis. To verify that VTA-implanted electrodes were positioned correctly, we tested whether the animal would self-stimulate these electrodes. After the final session of shuttle-box testing, the animal was placed in a custom-made operant chamber (18 cm × 18 cm × 23 cm) with a metal lever in the lower right corner. When the lever was pressed, a 200 ms train of 20 biphasic pulses of 0.2 ms duration was delivered at a frequency of 100 Hz. Only animals that achieved high rates of operant responding within 2-5 d were considered to have proper VTA placements and were included in subsequent data analyses.
After the end of the experiments, the gerbils were anesthetized with ketamine-xylazine, and their brains were rapidly isolated, frozen in liquid-nitrogen-chilled isopentane, and finally stored at −20°C. Coronal sections (40 µm) were made using a sliding microtome (Leica cryostat), and Nissl combined with Prussian-blue staining was performed to reveal the iron deposits around the electrode tips (Fig. 1). Only subjects with iron deposits within the boundaries of the LHb were included in the data analyses (Fig. 2), while inclusion of the VTA subjects was based on the self-stimulation criterion as discussed above. The final animal numbers for the stimulation groups during phase 1 of training were VTA stimulated (n = 7), VTA unstimulated (n = 6), LHb stimulated (n = 8), and LHb unstimulated (n = 8). Stimulation conditions were reversed in phase 2 of training. The success rate (number of successful avoidance responses divided by total number of trials), average latency to initiate response (with failed trials assigned a latency score of 6 s, the maximum length of the CS), escape rate (number of successful escape responses divided by total number of trials), and intertrial crosses (spontaneous hurdle crosses during the intertrial interval) were calculated for each session. Repeated-measures ANOVAs, analyses of covariance (ANCOVAs), t tests, and Dunnett's post hoc tests were used to analyze the data as explained in the next section.
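The per-session measures defined above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: the `Trial` record, the `session_measures` function, and the example values are hypothetical, and only the definitions themselves (including the 6 s latency score for failed trials) come from the text.

```python
from dataclasses import dataclass
from typing import List, Optional

CS_DURATION_S = 6.0  # maximum CS length; latency score assigned to failed trials


@dataclass
class Trial:
    avoided: bool               # crossed the hurdle during the CS
    escaped: bool               # crossed the hurdle after shock onset
    latency_s: Optional[float]  # time from CS onset to crossing, if avoided


def session_measures(trials: List[Trial]) -> dict:
    """Success rate, mean latency, and escape rate for one session."""
    n = len(trials)
    if n == 0:
        raise ValueError("a session needs at least one trial")
    # Failed avoidance trials receive the maximum latency score (6 s).
    latencies = [
        t.latency_s if t.avoided and t.latency_s is not None else CS_DURATION_S
        for t in trials
    ]
    return {
        "success_rate": sum(t.avoided for t in trials) / n,
        "mean_latency_s": sum(latencies) / n,
        "escape_rate": sum(t.escaped for t in trials) / n,
    }


# Hypothetical four-trial session (illustrative values, not study data).
example = [
    Trial(avoided=True, escaped=False, latency_s=2.0),
    Trial(avoided=True, escaped=False, latency_s=4.0),
    Trial(avoided=False, escaped=True, latency_s=None),
    Trial(avoided=False, escaped=False, latency_s=None),
]
measures = session_measures(example)
print(measures)
```

Because failed trials are scored at the 6 s ceiling, the latency measure is bounded and moves together with the success rate, which is worth keeping in mind when the two measures are reported side by side.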

Figure 1.
Photographs of brain sections stained with cresyl violet and Prussian blue from an animal implanted with an electrode in the lateral habenula nucleus (top panels) and an animal implanted with an electrode in the ventral tegmental area (bottom panels). The blue reaction product indicates the site of electrical stimulation. MHb, Medial habenula; RN, red nucleus; SNc, substantia nigra pars compacta; SNr, substantia nigra pars reticulata; IP, interpeduncular nucleus.

Results
Effects of LHb and VTA stimulation during avoidance acquisition
Because the VTA and LHb nonstimulated groups were not significantly different from each other in terms of overall mean successful trials (41 vs 39) or mean response latencies (4.0 s vs 4.1 s), these two groups were pooled together into one nonstimulated control to simplify data analysis and presentation. The avoidance data (both successful trials and response latencies) were analyzed with a 3 × 5 (stimulation condition × training session) repeated-measures ANOVA, with session serving as the repeated measure. Both ANOVAs of successful trials and response latencies revealed significant main effects of stimulation (F(2,26) = 14.5 and 14.7, respectively, p < 0.001) and session (F(4,104) = 59.8 and 46.9, respectively, p < 0.001) with no significant interaction (F(8,104) = 1.17 and 1.85, p = 0.32 and 0.08, respectively).
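The degrees of freedom reported for this design follow from the group sizes given in Materials and Methods. The sketch below uses the textbook formulas for a mixed design with one between-subjects factor and one repeated factor, without any sphericity correction. The function name is hypothetical, and the calculation is a consistency check rather than a reconstruction of the authors' statistical software.

```python
def mixed_design_df(group_sizes, n_repeated_levels):
    """Degrees of freedom (effect, error) for a one-between, one-within ANOVA."""
    n_subjects = sum(group_sizes)
    n_groups = len(group_sizes)
    df_group = n_groups - 1
    df_subjects_error = n_subjects - n_groups        # error term for the group effect
    df_repeated = n_repeated_levels - 1
    df_within_error = df_repeated * df_subjects_error  # error term for within effects
    return {
        "group": (df_group, df_subjects_error),
        "repeated": (df_repeated, df_within_error),
        "interaction": (df_group * df_repeated, df_within_error),
    }


# Group sizes from the text: VTA stimulated (7), LHb stimulated (8),
# and the pooled nonstimulated controls (6 + 8 = 14); five sessions.
acquisition = mixed_design_df([7, 8, 6 + 8], n_repeated_levels=5)
print(acquisition)
```

With 29 subjects in three groups and five sessions, this gives (2, 26) for stimulation, (4, 104) for session, and (8, 104) for the interaction, matching the values reported above.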
The effects of VTA and LHb stimulation were approximately equal and opposite (Fig. 3). Dunnett's post hoc tests confirmed that VTA-stimulated animals had significantly more successful trials (p = 0.04) and shorter response latencies (p = 0.009) than nonstimulated controls, while LHb-stimulated animals had significantly fewer successful trials (p = 0.003) and longer response latencies (p = 0.01). Broadly speaking, VTA stimulation halved the amount of training needed to reach 50% avoidance, and LHb stimulation doubled it: the VTA-stimulated group achieved after one training session what the control group achieved after two sessions and what the LHb-stimulated group achieved after four.

Effects of LHb and VTA stimulation on speed of acquisition
The interaction term in the above ANOVA assesses differences in the slopes of the learning curves across groups. It could therefore be used to assess group differences in the rate of avoidance acquisition, provided that the learning curves were nonasymptotic. However, since the curves were asymptotic, it is inappropriate to interpret the interaction term in this way. This is because as a subject approaches its learning asymptote, its rate of learning necessarily slows down. An experimental manipulation might accelerate learning initially, but it cannot do so indefinitely. Thus, to appropriately evaluate whether brain stimulation evoked different learning rates, the interaction analysis must be applied before a group approaches its asymptote. Since the VTA-stimulated group already achieved near-asymptotic performance by session 2, it is impossible to evaluate differences in learning rate from a stimulation × session interaction. Therefore, we conducted an additional repeated-measures ANOVA of avoidance learning within the first session, using a 3 × 6 design (stimulation condition × training trials), with trials (averaged into 6 blocks of 10 trials) serving as the repeated measure.

Figure 2.
... Fig. 1). Continuous black lines represent subjects that were stimulated during acquisition (sessions 1-5), and dotted black lines represent subjects that were stimulated after acquisition (sessions 6-10). White and light gray lines represent subjects that were anatomical controls stimulated in regions adjacent to the LHb (light gray for hippocampus, white for thalamus) or VTA (white) during acquisition.

This ANOVA revealed a significant interaction for both avoidance responses and latencies (F(10,130) = 2.32 and 3.21, p = 0.015 and 0.001, respectively), indicating significantly different rates of acquisition between the stimulation groups. Figure 4 shows that the VTA-stimulated group exhibited an accelerated learning curve, while the learning curve of the LHb-stimulated group was virtually flat. However, this stimulation × trials interaction is driven by the VTA-stimulated group. Decelerated learning from LHb stimulation was not readily apparent in the first session, but rather emerged across sessions. This conclusion is supported by the fact that the LHb-stimulated group was not significantly different from unstimulated controls in session 1 (p > 0.50) but became significantly different in session 2 (p < 0.05) and highly significantly different in session 3 (p < 0.01). Moreover, if the repeated-measures ANOVA is restricted to comparing the LHb-stimulated and nonstimulated groups across the first three sessions (before their learning curves reach asymptote), there is a significant interaction for the more parametric measure of response latency (F(2,40) = 3.49, p = 0.04), although not for the more discrete measure of successful trials (F(2,40) = 2.01, p = 0.15).
In summary, VTA stimulation caused a rapid acceleration of avoidance acquisition. This group established a superior level of avoidance performance by the end of the first training session, which was followed by slower gains throughout the remaining training sessions as performance approached asymptote. On the other hand, LHb stimulation caused a subtle slowing of acquisition, with its effects accumulating gradually across days.
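The within-session analysis described in this section averages the 60 trials of the first session into 6 blocks of 10 trials before entering them as the repeated measure. A minimal sketch of that blocking step is shown here, assuming each trial is coded 1 for a successful avoidance and 0 otherwise. The function name and the example session are hypothetical and do not represent study data.

```python
from typing import List, Sequence


def block_means(trial_outcomes: Sequence[float], block_size: int = 10) -> List[float]:
    """Average consecutive trials into equal-sized blocks."""
    if block_size <= 0 or len(trial_outcomes) % block_size != 0:
        raise ValueError("trial count must be a positive multiple of block_size")
    return [
        sum(trial_outcomes[i:i + block_size]) / block_size
        for i in range(0, len(trial_outcomes), block_size)
    ]


# Hypothetical 60-trial session: no avoidance in the first half,
# consistent avoidance in the second half (illustrative only).
session_one = [0] * 30 + [1] * 30
blocks = block_means(session_one)
print(blocks)
```

The same function applies to response latencies if the per-trial latency scores are passed in place of the 0/1 outcomes.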

Anatomical specificity of stimulation effects
To confirm that the stimulation effects reported above were mediated by the LHb and VTA, and not by current spreading to surrounding regions, we repeated the above analysis using subjects with electrode placements in adjacent regions (Fig. 2). This consisted of a group of 4 subjects with electrodes in the adjacent hippocampal region, a group of 7 subjects with electrodes in adjacent thalamic nuclei, and a group of 7 subjects with electrodes in the vicinity of the VTA but that did not support self-stimulation after the conclusion of avoidance training. The same nonstimulated controls were used as before. Neither ANOVAs of successful trials nor response latencies revealed significant effects of stimulation (F(3,28) = 0.88 and 0.31, p = 0.46 and 0.82, respectively). Dunnett's post hoc tests confirmed that neither stimulation of VTA-adjacent electrodes (p = 0.55 for avoidance and 0.75 for latency) nor stimulation of hippocampal electrodes (p = 0.99 for avoidance and 0.99 for latency) nor stimulation of thalamic electrodes (p = 0.90 for avoidance and 0.93 for latency) caused a significant change in behavior from that of nonstimulated controls. This analysis indicates that electrodes must be placed within the LHb or VTA to influence avoidance learning.

Functional specificity of stimulation effects
Since manipulations of dopaminergic activity can impact motivation and general motor activity, it is important to assess whether our stimulation parameters influenced these variables and, if they did, to what extent motivation and activity can account for the observed effects on avoidance behavior. For example, an animal might fail to learn because it is not motivated to escape the foot shock, or a hyperactive animal might continually cross back and forth in the shuttle box and thereby achieve a high rate of avoidance success without actually learning the avoidance contingency.
To verify that all animals were sufficiently motivated by the foot shock, we assessed escape responses (crossing the hurdle after the onset of foot shock) using a 3 × 5 (stimulation condition × training session) repeated-measures ANOVA, with session serving as the repeated measure. The ANOVA revealed a significant main effect of session, indicating that all groups showed a reduction of escape responses over time as they switched to an avoidance strategy (F(4,104) = 19.0, p < 0.001). However, there was no significant effect of stimulation (F(2,26) = 1.15, p = 0.33) and no significant interaction (F(8,104) = 0.87, p = 0.60). Dunnett's post hoc tests confirmed that neither VTA stimulation (p = 0.94) nor LHb stimulation (p = 0.36) caused a significant shift in escape behavior from that of nonstimulated controls. If anything, the trend was for more escape responding in the LHb-stimulated group (Fig. 5). This effectively rules out the possibility that LHb stimulation resulted in a generalized motivational or motor deficit, which would have manifested as impaired escape behavior. On the contrary, it appears that stimulating the LHb after avoidance responses resulted in a selective impairment of avoidance learning.
It is still possible, however, that increased activity in the VTA-stimulated group could have contributed to its superior performance. To evaluate general activity level, we quantified the number of hurdle crosses during the intertrial interval, i.e., spontaneous crossing not evoked by the tone CS or the shock US. The total intertrial crosses per session were evaluated with a 3 × 5 (stimulation condition × training session) repeated-measures ANOVA, with session serving as the repeated measure. The ANOVA revealed a significant main effect of stimulation (F(2,26) = 6.85, p = 0.004) with no main effect of session (F(4,104) = 1.38, p = 0.25) and no significant interaction (F(8,104) = 1.28, p = 0.26). Figure 5 shows that the VTA-stimulated group engaged in much more spontaneous crossing than the other two groups. Dunnett's post hoc tests confirmed that VTA-stimulated animals performed significantly more intertrial crosses (p = 0.01), whereas LHb-stimulated animals were not significantly different (p = 0.57) from unstimulated controls. Thus, VTA stimulation appears to cause a significant increase in general motor activity. The question then becomes, can the apparent enhancement in avoidance learning in the VTA-stimulated group be explained by a simple increase in motor activity, or are the learning and motor effects independent of one another?
To address this question, we reevaluated the impact of VTA stimulation on avoidance learning, this time controlling for the effect of increased motor activity using a 2 × 5 (stimulation condition × training session) ANCOVA, contrasting VTA stimulation with no stimulation across sessions, using total intertrial crosses as a covariate indicative of general activity level. The ANCOVA showed a trend toward a relationship between general activity and avoidance responses (F(1,18) = 3.917, p = 0.063), but after partialling out the covariance between these two measures, the effect of VTA stimulation on avoidance responding remained significant (F(1,18) = 5.97, p = 0.025). On the other hand, there was a strong relationship between general activity and avoidance latencies (F(1,18) = 16.6, p = 0.001), and controlling for this relationship eliminated the significant effect of VTA stimulation on avoidance latency (F(1,18) = 2.87, p = 0.11). However, the learning acceleration effect associated with VTA stimulation in the first session was still apparent after covarying for spontaneous intertrial crossing, with a significant stimulation × trials interaction for both measures of avoidance responding and latency (F(5,90) = 2.76 and 2.74, p = 0.023 and 0.024, respectively). Thus, VTA stimulation improved avoidance learning beyond what was predicted by a simple motor-activation mechanism, although such a mechanism could account for the enhanced reaction times displayed by subjects in later sessions.

Omission of brain stimulation
For subjects that received VTA or LHb stimulation during the first five sessions analyzed above, training continued for an additional five sessions without brain stimulation to evaluate whether the gain from VTA stimulation or the deficit from LHb stimulation was reversible. Paired-samples t tests were used to compare session 5 (the final training session with stimulation) vs session 10 (the final training session without stimulation). There was a significant worsening of performance when VTA stimulation was omitted (Fig. 6), as indicated by a significant 16% reduction in avoidance responding and a 29% increase in avoidance latency (t(6) = 2.88 and 3.55, p = 0.03 and 0.01, respectively). Interestingly, without VTA stimulation, this group regressed to a level of performance equivalent to that of the sham control (80% vs 81% avoidance and 3.5 s vs 3.6 s latency). On the other hand, neither measure changed significantly following the omission of LHb stimulation (t(7) = 0.727 and 0.473, p = 0.49 and 0.65, for avoidance and latency, respectively). In fact, even after five extra training sessions, the group that had received LHb stimulation was still performing significantly worse than the sham control at the end of the first half of training (t(20) = 2.68 and 2.57, p = 0.01 and 0.02, for avoidance and latency, respectively). Thus, the effect of VTA stimulation was partially reversible, while the effect of LHb stimulation was not.
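For readers unfamiliar with the paired-samples comparison used here, the statistic can be sketched with the Python standard library alone. The function name and the example numbers are hypothetical and are not study data; the sketch shows only how each subject's session 5 score is paired with its own session 10 score, and why groups of 7 and 8 subjects give 6 and 7 degrees of freedom.

```python
import math
import statistics
from typing import Sequence, Tuple


def paired_t(first: Sequence[float], second: Sequence[float]) -> Tuple[float, int]:
    """Paired-samples t statistic and degrees of freedom for matched scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equally long samples with at least two pairs")
    # Within-subject differences; their variance drives the test.
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


# Hypothetical scores for three subjects (illustrative only).
session_a = [1.0, 2.0, 3.0]
session_b = [2.0, 4.0, 6.0]
t_value, df = paired_t(session_a, session_b)
print(round(t_value, 3), df)
```

The between-group comparison against the sham control reported in the same paragraph is an independent-samples test, so its degrees of freedom (20) come from the combined group sizes (8 + 14 − 2) rather than from the number of pairs.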

LHb stimulation after acquisition
To further address the question of whether the effect of LHb stimulation was actually an impairment of learning versus a direct suppression of avoidance behavior, a group implanted with LHb electrodes was trained initially without stimulation until learning approached the asymptote of 80-90% successful responding after five sessions. Then LHb stimulation was given for five additional training sessions. No decrement in performance was observed during these sessions (Fig. 7). On the contrary, there appeared to be a small improvement, though repeated-measures ANOVAs determined that this change was not statistically significant (F(5,35) = 1.67 and 1.52, p = 0.17 and 0.21, for avoidance and latency, respectively). Thus, LHb stimulation did not affect the performance of a previously learned avoidance response.

Discussion
The results support the general conclusion that VTA and LHb stimulation have opponent effects on avoidance learning. During acquisition, LHb stimulation impaired avoidance, while VTA stimulation improved it. The effects of LHb stimulation were long lasting, but LHb stimulation had no effect when given after successful avoidance acquisition. Thus, LHb stimulation specifically impaired avoidance acquisition without affecting memory retrieval, motivation, or the sensory-motor ability to perform the avoidance response. On the other hand, VTA stimulation improved both acquisition and motivation, as VTA-stimulated animals not only acquired the avoidance response faster but also reached above-normal levels of performance, which reverted to normal levels of responding when stimulation was discontinued.
An opponent relationship between the LHb and VTA dopaminergic neurons was first suggested by 2-deoxyglucose studies, which concluded that, out of all brain regions, the LHb shows the greatest response to various dopamine manipulations, always in a direction opposite to the dopamine signal (Wechsler et al., 1979; McCulloch et al., 1980; Wooten and Collins, 1981; Gomita and Gallistel, 1982; Pizzolato et al., 1984). Specifically, amphetamine and apomorphine reduced LHb metabolism (Wechsler et al., 1979; McCulloch et al., 1980), while haloperidol and dopaminergic lesions increased LHb metabolism (McCulloch et al., 1980; Wooten and Collins, 1981; Pizzolato et al., 1984). Gomita and Gallistel (1982) further showed that medial forebrain bundle (MFB) self-stimulation suppressed LHb activity and that the neuroleptic pimozide dramatically elevated LHb activity and eliminated self-stimulating behavior. This led the authors to hypothesize that the LHb could block reinforcement signals originating in the VTA. Supporting evidence for this hypothesis came from Sutherland and Nakajima (1981), who found that lesioning the LHb increased rates of MFB self-stimulation, and from Christoph et al. (1986), who first showed that electrical stimulation of the LHb inhibits 91% of VTA dopaminergic neurons, a finding recently replicated in the rat (Ji and Shepard, 2007) and extended to the primate (Matsumoto and Hikosaka, 2007). Thus, the LHb and VTA engage in a mutually inhibitory relationship, with activation of the dopaminergic reward system suppressing the LHb and activation of the LHb suppressing the dopaminergic reward system. Our findings of a beneficial effect of VTA stimulation and a detrimental effect of LHb stimulation on avoidance acquisition are consistent with this opponent relationship, though only VTA stimulation affected general motor activity and postacquisition performance. These asymmetrical effects could be attributed to asymmetrical stimulation, in that the VTA group performed more successful responses and therefore received more stimulation than the LHb group. This may have increased the probability of nonspecific motor performance effects in the VTA group.
It is well established that electrical stimulation of the VTA and MFB reinforces behavior (Wise and Rompre, 1989), and it seems likely that the performance enhancement from VTA stimulation combines the positive reinforcement of brain stimulation with the negative reinforcement of shock avoidance, which may itself depend on VTA-dopamine signaling. Several studies have shown that dopamine antagonists disrupt active avoidance behavior, whether administered systemically or directly into the nucleus accumbens (reviewed by Salamone, 1994), and lever pressing to avoid shock is accompanied by substantially increased dopamine release in the accumbens, comparable to that observed during lever pressing for food (McCullough et al., 1993). Moreover, this dopamine increase is specifically correlated with avoidance responding and not escape responding or amount of shock received (McCullough et al., 1993). Shuttle-box avoidance learning in gerbils is also accompanied by increased dopamine release in medial prefrontal cortex, which is maximal during early acquisition and continually decreasing during subsequent retrieval (Stark et al., 1999, 2000, 2001, 2004). And in humans, obtaining a monetary reward or avoiding a monetary loss evokes similar activity in medial orbitofrontal cortex, a region that appears to encode reward value (Kim et al., 2006). Thus, signals of positive and negative reinforcement may use a common dopaminergic reward pathway so that VTA stimulation following successful avoidance may boost the naturally occurring signal that occurs when an aversive outcome is averted.
Regarding the interpretation of the effects of LHb stimulation, several studies have explored the effects of habenula lesions on avoidance learning with mixed results. On the one hand, habenula lesions had no effect on learning a standard one-way active avoidance task, but they impaired performance when training was made more difficult and stressful (Thornton and Bradbury, 1989). On the other hand, habenula lesions markedly enhanced learning of a standard two-way active avoidance task (Van Hoesen et al., 1969; Wilson et al., 1972), but under very difficult training parameters, learning enhancement was minimal and limited to the second training session (Vale-Martínez et al., 1997). The reason for the discrepant lesion effects in one- versus two-way active avoidance is unclear, but it is noteworthy that studies that used two-way active avoidance with training parameters similar to ours found lesion effects opposite to our stimulation effects (Van Hoesen et al., 1969; Wilson et al., 1972).
One simple explanation is that LHb stimulation was aversive and punished the avoidance response. While LHb stimulation caused no overt change in behavior (such as running, jumping, or freezing) that one might expect from aversive brain stimulation, our stimulation may have been too brief to elicit such gross reactions. If LHb stimulation were aversive, one might further expect suppressed avoidance responding when stimulation was initiated late in training. No such suppression was observed, but it is possible that avoidance responding had simply become habitual by this point and difficult to modify. In conclusion, although there is no evidence of an aversive effect of LHb stimulation in our study or in the literature (Sutherland and Nakajima, 1981), we cannot rule out this explanation.
Another simple explanation is that stimulation of the LHb induced an analgesic effect. Long-lasting analgesia has been observed following electrical stimulation of the LHb (Mahieux and Benabid, 1987). However, these authors used a 60 s train of stimulation as opposed to our 0.2 s train, and the most significant analgesia was established 40-90 min following stimulation. Thus, our brief stimulation should have been insufficient to produce continued analgesia, and, if analgesia did occur, it should have happened after the end of each training session. Moreover, while we did not perform any quantitative test of nociception, we did verify that gerbils receiving LHb stimulation continued to show an aversive reaction to the foot shocks with normal escape behavior.
Assuming that aversive or analgesic effects could be ruled out, how could LHb stimulation, paired with successful avoidance responses, suppress avoidance learning? Three studies using either rodents or primates have established that electrical stimulation of the LHb suppresses the firing of dopaminergic neurons in the VTA (Christoph et al., 1986; Ji and Shepard, 2007; Matsumoto and Hikosaka, 2007), and transient inhibition of the LHb may be necessary for increased dopamine release to occur (Lecourtier et al., 2008). Thus, our LHb stimulation parameters should have diminished the dopamine reward signal at the moment of negative reinforcement. Although few groups have investigated the potential importance of the dopaminergic reward pathway in avoidance learning, the mesocorticolimbic dopamine system is indeed activated by this paradigm (McCullough et al., 1993; Stark et al., 1999, 2000, 2001, 2004; Coco and Weiss, 2005). If, as these studies suggest, the dopamine signal is important for modulating the operant significance of the avoidance response and modifying an animal's coping strategy, then LHb stimulation could interfere with this process and thereby retard avoidance learning.
However, while the present data demonstrate an opposite neuromodulatory effect of VTA and LHb stimulation on avoidance learning, the findings cannot confirm a dopaminergic mechanism or rule out an alternative one. The LHb influences many ascending neuromodulatory systems of the brainstem. For example, LHb stimulation has been found to inhibit serotonin neurons of the raphe nuclei (Wang and Aghajanian, 1977; Stern et al., 1979; Ferraro et al., 1997) and to increase noradrenaline and acetylcholine release in the forebrain (Kalén et al., 1989; Nilsson et al., 1990; Cenci et al., 1992). Any one or combination of these effects, not to mention unknown stimulation effects, could account for the observed differences in behavior, and more experimental work is needed to test which stimulation-induced brain changes are relevant for the avoidance learning impairment.
In conclusion, the present results are consistent with opponent roles for the LHb and VTA during the acquisition phase of learning but not during retrieval. Just as activation of the VTA may help amplify novel behaviors and strategies that are relevant and useful, activation of the LHb may help suppress acquisition of behaviors and strategies that are irrelevant and useless. More broadly, these experiments illustrate the potential investigational utility of delivering electrical brain stimulation in conjunction with particular learning events. In this case, when the VTA was stimulated at a time point when its activation was hypothetically appropriate, learning was facilitated; when the LHb was stimulated at a time point when its activation was hypothetically inappropriate, learning was retarded. Moreover, in the postacquisition phase when dopamine release is normally minimal, LHb stimulation would be neither appropriate nor inappropriate and, consistent with this hypothesis, did not affect avoidance performance.