Abstract
Phasic dopamine (DA) transmission encodes the value of reward-predictive stimuli and influences both learning and decision-making. Altered DA signaling is associated with psychiatric conditions characterized by risky choices such as pathological gambling. These observations highlight the importance of understanding how DA neuron activity is modulated. While excitatory drive onto DA neurons is critical for generating phasic DA responses, emerging evidence suggests that inhibitory signaling also modulates these responses. To address the functional importance of inhibitory signaling in DA neurons, we generated mice lacking the β3 subunit of the GABAA receptor specifically in DA neurons (β3-KO mice) and examined their behavior in tasks that assessed appetitive learning, aversive learning, and risk preference. DA neurons in midbrain slices from β3-KO mice exhibited attenuated GABA-evoked IPSCs. Furthermore, electrical stimulation of excitatory afferents to DA neurons elicited more DA release in the nucleus accumbens of β3-KO mice as measured by fast-scan cyclic voltammetry. β3-KO mice were more active than controls when given morphine, which correlated with potential compensatory upregulation of GABAergic tone onto DA neurons. β3-KO mice learned faster in two food-reinforced learning paradigms, but extinguished their learned behavior normally. Enhanced learning was specific for appetitive tasks, as aversive learning was unaffected in β3-KO mice. Finally, we found that β3-KO mice had enhanced risk preference in a probabilistic selection task that required mice to choose between a small certain reward and a larger uncertain reward. Collectively, these findings identify a selective role for GABAA signaling in DA neurons in appetitive learning and decision-making.
Introduction
While reward-predictive dopamine (DA) neuron responses have been well characterized, the mechanisms underlying phasic DA transmission remain incompletely understood. Many glutamatergic nuclei converge upon the midbrain DA neurons (Geisler et al., 2007) and the rapid activation of DA neurons critically depends upon intact inputs from hindbrain glutamatergic and cholinergic nuclei such as the laterodorsal and pedunculopontine tegmentum (PPTg) (Lokwan et al., 1999; Pan and Hyland, 2005; Lodge and Grace, 2006; Zweifel et al., 2009). Similar to DA neurons, PPTg neurons are activated by rewards and reward-predictive stimuli in primates (Okada et al., 2009). However, reward-predictive stimuli elicit sustained activity in cue-selective PPTg neurons but only a transient spike in DA neuron activity (Schultz et al., 1997; Okada et al., 2009). Furthermore, PPTg neurons are activated by sensory stimuli even before they become reward-associated (Pan and Hyland, 2005). Finally, there is no evidence that PPTg neurons respond the same as DA neurons upon the omission of an expected reward or to aversive stimuli (Waelti et al., 2001; Ungless et al., 2004; Jhou et al., 2009). The discrepancies between the PPTg and DA neuron activity patterns implicate other brain nuclei in the modulation of reward-predictive signals in DA neurons.
Recent evidence suggests that inhibitory transmission modulates reward-predictive DA responses. For instance, activity patterns opposite to those of DA neurons during Pavlovian tasks have been observed in both the lateral habenula (LHb) and the rostromedial tegmentum (RMTg). LHb neurons project to both the ventral tegmental area (VTA) and the RMTg, and the RMTg is a primary source of GABAergic input to DA neurons (Matsumoto and Hikosaka, 2007; Jhou et al., 2009). Furthermore, neurons that indirectly project to the LHb scale their activity with the probability of reward delivery for a given stimulus (Hong and Hikosaka, 2008; Oyama et al., 2010), similar to DA neurons (Fiorillo et al., 2003). Reduction of GABAergic tone onto DA neurons can itself be reinforcing, as exemplified by the addictive properties of opiates and some benzodiazepines (Johnson and North, 1992; Tan et al., 2010). These findings suggest that encoded information about rewards and their predictive stimuli is transmitted to DA neurons through GABAergic in addition to glutamatergic modulation.
Because DA neuron responses to predictive stimuli occur within 50–150 ms (Schultz et al., 1997), GABAergic modulation of these responses is likely occurring through ionotropic GABAA receptors (GABAA-R). To address the contribution of GABAA-R in DA neurons to behavior, we generated mice lacking the GABAA-R β3 subunit specifically in DA neurons because its mRNA is expressed in DA neurons (Okada et al., 2004). Moreover, deletion of the Gabrb3 gene dramatically attenuates GABA-evoked IPSCs in cultured neurons, an effect that is, in part, due to the reduced surface expression of non-β3 GABAA-R subunits, suggesting a role for the β3 subunit in trafficking mature GABAA-R to the cell membrane (Ramadan et al., 2003). Given the importance of the GABAA-R β3 subunit in GABAA-R signaling, we explored the behavioral consequences of selective Gabrb3 deletion from DA neurons using our β3-KO mice.
Materials and Methods
Animals.
All experiments were performed in accordance with the policies of the Institutional Animal Care and Use Committee at the University of Washington. Mice with a conditional Gabrb3 allele (Gabrb3lox/lox) and mice expressing Cre recombinase from the DA transporter promoter (Slc6a3Cre/+) were obtained from Dr. Gregg Homanics (University of Pittsburgh, Pittsburgh, PA) and Dr. Xiaoxi Zhuang (University of Chicago, Chicago, IL), respectively (Zhuang et al., 2005; Ferguson et al., 2007). β3-KO and control mice were generated by crossing Gabrb3lox/lox mice with mice that were heterozygous for β3 and had one DA transporter allele driving expression of Cre recombinase (Gabrb3Δ/+; Slc6a3Cre/+). We chose this breeding strategy to avoid any recombination in the germline of mice expressing Cre recombinase. All animals were extensively crossed to a C57BL/6 background (>5 generations). β3-KO mice and their littermate controls between 8 and 16 weeks of age were used for behavioral testing. For instrumental conditioning and the switching task, animals were maintained at ∼85% ad libitum body weight by rationing normal chow (LabDiet 5053) based upon the monitored daily food intake and body weight of each mouse. For the probabilistic selection task, mice were maintained at ∼90% ad libitum body weight.
Electrophysiology.
All electrophysiology experiments were performed using VTA-containing brain slices from male mice as described previously (Wanat et al., 2008). Briefly, mice were given pentobarbital until loss of righting reflex, and horizontal brain slices (300 μm) were prepared in a chilled solution containing the following (in mM): 87 NaCl, 2.5 KCl, 1.25 NaH2PO4, 25 NaHCO3, 0.5 CaCl2, 7 MgCl2, and 75 sucrose. Slices were recovered over 30 min at 32°C in artificial CSF (ACSF) (pH = 7.4; 295–305 mOsm) containing the following (in mM): 124 NaCl, 5 KCl, 1.25 NaH2PO4, 2 MgCl2, 2 CaCl2, 10 d-glucose, and 26 NaHCO3. Whole-cell patch-clamp recordings were made with an Axopatch 700A amplifier (Molecular Devices) with filtering at 1 kHz, using 2–5 MΩ electrodes. For miniature IPSC (mIPSC)- and GABA-evoked current recordings, electrodes were filled with an internal solution (pH = 7.2–7.4, 260–280 mOsm) containing the following (in mM): 44 KCl, 1 CaCl2, 3.45 potassium-1,2-bis(2-aminophenoxy)-ethane-N,N,N,N-tetraacetic acid, and 10 HEPES. For miniature EPSC (mEPSC) recordings, electrodes were filled with an internal solution (pH = 7.2–7.4, 260–280 mOsm) containing the following (in mM): 120 cesium methanesulfonate, 10 HEPES, 0.2 EGTA, 8 NaCl, 2 MgCl2, 2.5 Mg-ATP, and 0.25 Na-GTP. ACSF at room temperature was continuously perfused at a rate of ∼2.0 ml/min. Clampex (Molecular Devices) was used for data acquisition. Neurons with a large hyperpolarization-induced cation current (Ih) were deduced to be DA neurons and neurons lacking Ih were treated as non-DA neurons. Although it is reported in rats that some non-DA neurons have Ih (Margolis et al., 2006) and that mesocortical DA neurons lack Ih in mice (Lammel et al., 2008), the presence of Ih is still a strong predictor of DA neurons in mice (Wanat et al., 2008; Zhang et al., 2010).
mIPSCs and mEPSCs were collected from neurons held at −70 mV for 5 min or up to 500 detected events as described previously (Bamford et al., 2004) and then analyzed with Mini Analysis (Synaptosoft). Each event was visually inspected to exclude noise artifacts from the analysis. mIPSC experiments were performed in the presence of d-2-amino-5-phosphonopentanoic acid (d-APV; 100 μM), 6-cyano-2,3-dihydroxy-7-nitro-quinoxaline (CNQX; 10 μM), strychnine (1 μM), eticlopride (100 nM), and CGP-52432 (10 μM) to block ionotropic glutamate, glycine, DA D2, and GABAB receptors, respectively. Tetrodotoxin (500 nM) was also included in the ACSF to block sodium channels. With this combination of drugs, application of GABA (1 mM) for 30 s was used to elicit GABAA-R-mediated currents in neurons held at −70 mV. The peak GABAA-R-mediated current was determined by the maximal holding current during GABA bath application. mEPSC experiments were performed in the presence of tetrodotoxin (500 nM) and picrotoxin (100 μM). All drugs were obtained from Sigma, except for d-APV, CNQX, tetrodotoxin, CGP-52432, and picrotoxin, which came from Tocris Bioscience.
mIPSC amplitudes were analyzed by largest amplitude count-matching (Stell and Mody, 2002). The average amplitude of all events in the first minute of recording in β3-KO mice (n = 10 cells; 10 min; 1317 events) was compared with the average amplitude of the largest 1317 events occurring within the first 1.11 min of recording in controls (n = 9 cells; 10 min; 1317 events).
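The count-matching comparison described above can be sketched in Python as follows. This is only an illustrative sketch, not the analysis code used in the study: the function name `count_matched_means` and the toy amplitudes are hypothetical, and it assumes that event amplitudes have already been pooled across cells within each genotype.

```python
import numpy as np


def count_matched_means(ko_amplitudes, control_amplitudes):
    """Return (KO mean, count-matched control mean) of event amplitudes.

    Every event from the lower-frequency group (here, KO) is kept. From the
    control pool, only the N largest events are kept, where N is the number
    of KO events. Amplitudes are compared as absolute values (assumption:
    inward currents may be stored as negative numbers).
    """
    ko = np.abs(np.asarray(ko_amplitudes, dtype=float))
    control = np.abs(np.asarray(control_amplitudes, dtype=float))
    n_events = ko.size
    if control.size < n_events:
        raise ValueError("control pool needs at least as many events as KO")
    # Ascending sort, then take the last n_events entries (the largest ones).
    largest_control = np.sort(control)[-n_events:]
    return float(ko.mean()), float(largest_control.mean())


# Toy data in pA (hypothetical values, not from the study).
ko_mean, control_mean = count_matched_means(
    [10.0, 12.0, 14.0], [5.0, 8.0, 20.0, 25.0, 30.0]
)
print(ko_mean, control_mean)
```

Because only the largest control events enter the comparison, the procedure asks whether the events that remain detectable in the KO are smaller than the control events they would plausibly correspond to if small events had simply dropped below the detection threshold.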
Fast-scan cyclic voltammetry to measure DA release.
Fast-scan cyclic voltammetry (FSCV) was performed as in a previous study (Zweifel et al., 2009). Briefly, male mice were anesthetized with 1.5 g/kg urethane in their home cage. After 1 h, the animal's corneal reflex was tested to ensure a deep level of anesthesia. Animals were placed into a stereotaxic apparatus and a glass-encased, carbon-fiber electrode was lowered into the NAc core [anterior–posterior (AP), 1.52 mm from bregma; medial–lateral (ML), 1.15 mm from bregma; and dorsal–ventral (DV), −3.75 mm from dura]. A stimulating electrode was incrementally lowered into the PPTg (AP, −0.68 mm from lambda; ML, 0.7 mm from bregma) until a stimulation-elicited DA signal was observed. In this experiment, the average DV coordinate was −2.6 mm from dura. Bipolar stimulating electrodes (Plastics One; stainless steel, 0.15 mm diameter) were connected to an ISO-flex stimulation system (A.M.P.I.). Once electrodes were in place, stimulation current was varied from 25 to 400 μA (60 Hz and 1 s duration). Stimulus duration was then varied from 1 s to 16.7 ms (400 μA and 60 Hz). Then stimulation frequency was varied from 60 to 5 Hz (200 μA and 0.5 s). Two stimulations were done at each stimulus condition and the resulting DA responses were averaged.
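As a rough aid to reading the stimulus-duration series, the nominal number of pulses in a train is the product of frequency and duration. The short Python sketch below works this out for the two ends of the 60 Hz duration range given above. The helper name `nominal_pulse_count` is hypothetical, and the calculation assumes evenly spaced pulses; the exact pulse timing used by the stimulator is not specified in the text.

```python
def nominal_pulse_count(frequency_hz: float, duration_s: float) -> int:
    """Nominal pulses in a train, rounded to the nearest whole pulse.

    Assumption: pulses are evenly spaced at 1/frequency. Products that fall
    halfway between integers depend on exact pulse timing and are not
    handled specially here.
    """
    return round(frequency_hz * duration_s)


# Ends of the duration series at 60 Hz (values taken from the text above).
for duration_s in (1.0, 0.0167):
    print(duration_s, nominal_pulse_count(60.0, duration_s))
```

Under these assumptions, the 16.7 ms condition at 60 Hz corresponds to roughly a single pulse, whereas the 1 s condition corresponds to a 60-pulse train.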
Locomotion.
Overnight locomotor activity was monitored using infrared beams in activity chambers (Columbus Instruments). For basal activity, individually housed male and female mice were placed into the chambers with food and water ad libitum and activity was monitored for three consecutive 24 h periods. For morphine-induced locomotion, male and female mice were individually housed in a chamber with no food or water and injected 3 h later with either saline or morphine. Activity was then monitored for 3 h. Animals received three sessions of intraperitoneal saline injections followed by two sessions of intraperitoneal morphine injections (12.5 and then 25 mg/kg). The last of the three saline injections was used as a baseline for subsequent morphine injections.
Rotarod performance.
Male and female mice were tested for their ability to learn in a motor coordination task by placing them on a rotating rod (4 cm diameter) that incrementally accelerated from 5 to 50 rpm over 3 min. Mice were given 3 sessions per day with ∼5 min between sessions. Rotarod performance was assessed on three consecutive days as latency to fall from or to clasp the rotating rod.
T-maze.
Rewarded (Rew+) and unrewarded (Rew−) arms of the T-maze were randomly assigned for each animal and remained the same throughout training. Each arm contained a wire mesh at the end, with a 20 mg reward pellet above (accessible) or below (inaccessible) for the Rew+ or Rew− sides, respectively. Both male and female mice were used in this paradigm. Mice began each trial behind a starting gate at the base of the T-maze. Once the gate was lifted the mice were allowed 60 s to make a choice (>50% of their body across the entry to either the Rew+ or Rew− arm). Once a choice was made, the maze arm was blocked and the mice were allowed to consume the reward (Rew+ choice) or were given a 60 s timeout (Rew− choice). Mice were given 10 trials per day for 10 d. After training, mice were given 4 reversal learning sessions during which the Rew+ and Rew− arms were switched.
Instrumental conditioning.
All behavioral testing was conducted in operant conditioning chambers (ENV-307W, Med Associates) using both male and female mice. Mice were given a 30 min magazine training session in which 10 pellets were freely available in the food receptacle. Mice then underwent ten 60 min instrumental conditioning sessions. The beginning of each trial was signaled by a flashing light over the food receptacle. Once the animal made a nosepoke into the food receptacle, one of two levers was extended into the chamber. There was no time limit to perform the nosepoke or to press the lever in this task. Each lever press was reinforced (fixed ratio:1) with a 20 mg food pellet (Bio-Serv) and followed by a 10 s intertrial interval (ITI). A subset of mice underwent 5 extinction sessions during which lever pressing was unrewarded.
Probabilistic selection task.
Male and female mice were trained under the same conditions as in the instrumental conditioning task. The training procedure was modified from a similar procedure used for rats (Nasrallah et al., 2009). The ITI was progressively extended to 60 s and the time limit for initiating a trial and executing the lever press was reduced to 10 s. The nosepoke requirement at the beginning of the trial was necessary to center the animal and avoid any possible positional bias in the subject's lever preference. Failure to initiate or failure to press the lever in 10 s resulted in a missed trial and a 60 s timeout. Animals received two sessions per day of 24 trials each. Once the mice successfully completed 20 or more of the 24 trials, one lever was randomly assigned to deliver two food pellets and one to continue delivering a single food pellet. Mice received two training sessions per day. The first session consisted of 24 forced trials (12 high-reward and 12 low-reward lever presentations) and the second session had 24 choice trials during which both levers were presented. Animals were trained under these conditions until they had >75% preference for the high-reward lever in the second daily training session. Once the animals had a preference for the high-reward lever, the probability of reinforcement for that lever was varied across days. The probability of reinforcement was the same in the forced trial and choice trial sessions on a given day. The object of the forced trial session was to allow the animals to experience the contingencies of each lever and then to decide which lever to press in the second session based upon experience.
Switching task.
Male and female mice were trained the same as in the Probabilistic selection task except that each lever was reinforced 100% of the time. Once the animals had >75% preference for the high-reward lever, the high- and the low-reward levers were switched and the animals were given 7 d of training under the new contingency.
Contextual fear conditioning.
Male and female mice were individually placed into a square chamber with a metal grid floor (Coulbourn Instruments) and allowed to explore the context for 2 min before a single 0.7-mA, 2 s footshock. Mice were returned to their home cage 1 min after the footshock. Freezing was assessed every 5 s throughout training. Chambers were cleaned in between animals with a 1% acetic acid solution. Contextual learning was assessed the next day by placing the animals back into the chamber and monitoring freezing for 3 min.
Two-way active avoidance.
Male mice were individually placed into a two-chamber active avoidance apparatus with free access to both chambers (PACS-30, Columbus Instruments). After a 3 min habituation period, animals began receiving trials in which a 7 s tone (80 dB, 2.5 kHz) was delivered with a 2 s foot shock (0.3 mA) presented during the last 2 s of the tone. Mice could avoid delivery of the foot shock by moving to the opposite side of the chamber during the 5 s of tone presentation that preceded shock onset. Mice received one session per day consisting of 100 trials with a 40 s ITI for 5 consecutive days. The number of avoidances was binned into 20-trial blocks and reported as the percentage of shocks avoided per block.
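The block-wise summary described above (20-trial blocks, percentage of shocks avoided per block) can be illustrated with a minimal Python sketch. The function name `percent_avoided_per_block` and the example session are hypothetical; the sketch assumes each trial has already been scored as avoided or not avoided.

```python
def percent_avoided_per_block(avoided, block_size=20):
    """Percentage of trials with a successful avoidance in each block.

    avoided: sequence of booleans, one per trial, in trial order.
    block_size: trials per block (20 in the description above).
    """
    if len(avoided) % block_size != 0:
        raise ValueError("number of trials must be a multiple of block_size")
    return [
        100.0 * sum(avoided[start:start + block_size]) / block_size
        for start in range(0, len(avoided), block_size)
    ]


# One hypothetical 100-trial session in which avoidance improves over blocks.
session = (
    [False] * 20
    + [True] * 5 + [False] * 15
    + [True] * 10 + [False] * 10
    + [True] * 15 + [False] * 5
    + [True] * 20
)
print(percent_avoided_per_block(session))
```

A 100-trial session yields five values per animal per day, which is the granularity at which learning curves in this kind of paradigm are typically plotted.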
Statistics.
All analyses and graphical representations were done using Microsoft Excel and GraphPad Prism unless otherwise noted. All Student's t tests were two-tailed and unpaired. Significant isolated comparisons were done using Bonferroni post hoc analyses when applicable. All statistical results are presented in the figures and their captions.
Results
DA neurons in midbrain slices from β3-KO mice have reduced GABAA-R signaling
To confirm that conditional genetic deletion of the β3 subunit attenuated GABAA-R function in DA neurons, whole-cell, voltage-clamp recordings were obtained from the VTA of midbrain slices from control (Gabrb3lox/+; Slc6a3Cre/+) and β3-KO mice (Gabrb3lox/Δ; Slc6a3Cre/+). VTA DA neurons were identified by their large hyperpolarization-induced currents (Ih), as previously described (Wanat et al., 2008; Zhang et al., 2010). Once a DA neuron was patched, picrotoxin-sensitive spontaneous mIPSCs were monitored for a period of 5 min (Fig. 1A). There were no differences in the input resistances for DA neurons in β3-KO and control mice (210.5 ± 33.6 MΩ vs 208.3 ± 29.0 MΩ, respectively; p = 1.0). Consistent with attenuated GABAA-R signaling, the fraction of mIPSCs in DA neurons with larger interevent intervals increased during the fixed recording period and the overall frequency of events decreased ∼45% in β3-KO mice compared with controls (1.2 ± 0.2 Hz vs 2.1 ± 0.4 Hz, respectively; *p < 0.05; Fig. 1A,B). Note that some statistical comparisons and subject numbers are reported in the figure legends. In addition to a decrease in frequency, we found that there was a trend toward a reduction in the overall amplitude of mIPSCs in the DA neurons of β3-KO mice (14.4 ± 1.2 pA vs 19.9 ± 3.0 pA; p = 0.1) that resulted in a significant decrease in the fraction of events occurring at larger amplitudes (Fig. 1C). Moreover, we found that the reduction in mIPSC frequency was most pronounced for large amplitude (>20 pA) events (Fig. 1D). When we corrected for the reduced mIPSC frequency in β3-KO mice using largest amplitude count-matching (Stell and Mody, 2002), there was a significant reduction in count-matched amplitude of mIPSCs in the DA neurons of β3-KO mice during the first minute of recording (14.8 ± 0.3 pA vs 30.1 ± 0.5 pA; ***p < 0.001). Together, these findings suggest that intrinsic GABAA-R signaling is disrupted in the DA neurons of β3-KO mice.
We next tested the sensitivity of DA neurons in β3-KO mice to the direct application of exogenous GABA (1 mM). In agreement with previous reports using cultured neurons from total GABAA-R β3 KO mice (Ramadan et al., 2003), we found that the DA neurons in our KO mice had significantly attenuated GABA-evoked whole-cell current (∼4-fold) when GABA was added to the perfusate (Fig. 1E). These results establish a functional role for GABAA-R β3 in DA neurons and confirm that the sensitivity of DA neurons to GABAergic signaling is significantly decreased in β3-KO mice.
Basal locomotor activity and rotarod performance are normal while morphine-induced locomotion is enhanced in β3-KO mice
Because DA signaling is critical for locomotor activity (Zhou and Palmiter, 1995), we monitored the activity of β3-KO and control mice for 3 d. Both β3-KO and control mice had similar patterns and levels of activity (Fig. 2A). To assess locomotor function more rigorously, we tested β3-KO and control mice on the rotarod. Both groups improved their performance on the rotarod similarly across training sessions (Fig. 2B).
We next asked whether manipulating GABAergic transmission onto DA neurons would alter the locomotor behavior of β3-KO mice by administering morphine, which is known to attenuate GABAergic tone onto DA neurons by activating inhibitory μ-opioid receptors on nearby GABAergic neurons (Johnson and North, 1992). By disinhibiting DA neurons, morphine enhances DA release in target regions (Pontieri et al., 1995) which is required for the locomotion-inducing effect of morphine in mice (Hnasko et al., 2005). We found that β3-KO mice had more robust locomotor activation (∼3-fold) in response to morphine resulting in a significant leftward shift in their dose–response curve compared with controls (Fig. 2C,D). Together, these findings suggest that compensatory changes allow normal DA signaling and behavior under basal conditions despite a ∼4-fold reduction in GABAA-R signaling in the DA neurons of β3-KO mice. The hyperactivity of β3-KO mice in response to morphine suggests that DA neurons are more easily excited when GABA release is attenuated by morphine administration. Thus, the disrupted GABAA-R signaling in the DA neurons of β3-KO mice becomes behaviorally evident when their DA system is pharmacologically challenged.
DA neurons have normal mEPSCs but non-DA neurons in the VTA of β3-KO mice have smaller mIPSCs
Since other mouse models of hyperdopaminergia exhibit some level of hyperactivity (Giros et al., 1996; Bello et al., 2011), the finding that β3-KO mice had normal basal activity and only became hyperactive when GABAergic tone onto DA neurons was pharmacologically attenuated suggests that compensatory changes may have occurred within their DA system. One possibility is that excitatory drive onto DA neurons is attenuated in β3-KO mice. To assess whether excitatory synaptic activity is altered in the DA neurons of β3-KO mice, we monitored mEPSCs in these neurons in control and β3-KO mice. There were no significant differences between control and KO mice in either the frequency or amplitude of mEPSCs as indicated by similar distributions in their interevent intervals and event amplitudes (Fig. 3A,B). Therefore, compensatory changes in the intrinsic excitatory synaptic activity in the DA neurons of β3-KO mice do not appear to account for their normal basal activity.
Another possible compensatory mechanism might be the upregulation of GABAergic signaling onto the DA neurons of β3-KO mice. This could occur by enhanced activity in the local GABAergic neurons of the VTA (Steffensen et al., 1998), which are known to exert tonic inhibition onto DA neurons (Johnson and North, 1992). To test this hypothesis, we monitored mIPSCs in neurons lacking Ih current. Although there was no difference in the frequency of mIPSCs in these neurons (Fig. 3C), there was a marked decrease in the amplitude of inhibitory events in the non-DA neurons of β3-KO mice (Fig. 3D). These results suggest that compensatory changes may have occurred in the non-DA neurons of the VTA in β3-KO mice, and these changes might allow for increased GABAergic transmission onto the DA neurons of these mice. This observation could account for the normal basal locomotor activity of β3-KO mice and may explain why these mice become more active than controls when the neurons that provide GABAergic input to DA neurons are inhibited by morphine.
β3-KO mice have increased DA release in the NAc in response to PPTg stimulation
Previous work has demonstrated that disinhibiting DA neurons by attenuating GABAA-R signaling elicits bursts of action potentials in these neurons (Tepper and Lee, 2007; Lobb et al., 2011). Because burst-firing in DA neurons contributes to phasic DA release (Zweifel et al., 2009), we predicted that phasic DA release in response to excitatory input would be enhanced in β3-KO mice. To test this hypothesis, we stimulated glutamatergic afferents to DA neurons and monitored DA release in the nucleus accumbens using fast-scan cyclic voltammetry. We targeted the PPTg, which is a hindbrain nucleus that sends glutamatergic and cholinergic projections to midbrain DA neurons (Geisler et al., 2007). Electrical stimulation of the PPTg elicits burst firing in DA neurons as well as phasic DA release in the striatum (Zweifel et al., 2009). Furthermore, PPTg stimulation elicits GABAA-R-dependent current in the majority of VTA DA neurons in rat brain sections with anatomically preserved PPTg and VTA connectivity (Good and Lupica, 2009). Given that the DA neurons of β3-KO mice are less responsive to GABA, we predicted that electrically stimulating the PPTg in these animals would lead to elevated DA release in target regions. We recorded nucleus accumbens DA and observed a significant 2- to 5-fold increase in DA release in β3-KO relative to control mice at varying stimulus intensities, durations, and frequencies (Fig. 4A–C). These observations suggest that the excitation of DA neurons by glutamate and acetylcholine (Good and Lupica, 2009) is modulated by GABAA-R and that attenuating GABAA-R signaling disrupts inhibitory input to DA neurons and augments their excitability.
β3-KO mice have enhanced acquisition but normal reversal or extinction during appetitive learning
Because DA neurons can encode a reward-prediction error signal during appetitive learning (Pan et al., 2005, 2008), DA is thought to be critical for the acquisition and extinction of cue-reward associations. We previously found that mice with impaired phasic DA release exhibit deficits in the acquisition of some appetitive behaviors (Zweifel et al., 2009). Because β3-KO mice have enhanced phasic dopamine release, we hypothesized that these animals would have enhanced learning in these same tasks. Indeed, we found that β3-KO mice acquired a food-reinforced instrumental conditioning task significantly faster than controls (Fig. 4A). We next trained mice to discriminate between two arms of an appetitive T-Maze to obtain a food reward. Similar to our findings during instrumental conditioning, we found that β3-KO mice discriminated better than controls in the T-maze task (Fig. 4B).
Reward omission typically results in the phasic inhibition of DA neurons, and this inhibition is thought to be driven by GABA and to contribute to the extinction of learned associations (Waelti et al., 2001; Pan et al., 2008). Although GABAA-R signaling is only attenuated in β3-KO mice and not completely gone, we hypothesized that extinction would be altered in these animals. Surprisingly, we found that control and β3-KO mice performed the same under extinction conditions in the instrumental conditioning task (Fig. 4C) and reversed their discrimination normally in the T-maze task when the contingencies of the two arms were switched (Fig. 4D). Together, these results indicate that β3-KO mice acquire appetitive associations faster than controls but extinguish those associations normally.
β3-KO mice exhibit normal aversive learning
After demonstrating that β3-KO mice acquired appetitive associations faster than controls, we next compared the ability of these animals to learn from aversive stimuli. We have previously demonstrated that intact DA transmission is required for aversive learning (Fadok et al., 2009). While inhibition of DA signaling is implicated in the processing of aversive stimuli (Ungless et al., 2004; Roitman et al., 2008; Jhou et al., 2009), whether GABAergic signaling in DA neurons contributes to learning from aversive events is not known. Nonetheless, lesioning the primary source of GABAergic input to DA neurons attenuates conditioned freezing to an auditory tone in rats (Jhou et al., 2009), so we predicted that attenuating GABAA-R signaling in DA neurons would recapitulate this phenotype and result in a deficit in contextual fear conditioning in β3-KO mice. We compared the ability of β3-KO and control mice to associate a context with footshock delivery in a single-trial contextual fear conditioning assay. Surprisingly, β3-KO and control mice exhibited the same amount of contextual freezing after receiving a single footshock (Fig. 5A).
To extend our examination of aversive learning to another learning paradigm, we examined these animals in a two-way active avoidance paradigm. During two-way active avoidance, mice were trained to escape from one compartment to another in response to a tone that predicted footshock delivery. It has been shown that DA is required for learning in this paradigm in mice (Darvas et al., 2011). Similar to contextual fear conditioning, β3-KO and control mice learned at similar rates in a two-way active avoidance paradigm (Fig. 5B). These results suggest that attenuating GABAA-R signaling in DA neurons does not affect the ability of mice to associate a context or a cue with an aversive footshock.
β3-KO mice are more risk-preferring in a probabilistic selection task
Enhanced DA transmission can promote pathological gambling in humans (Dodd et al., 2005) and is associated with increased risk-preference in rodents (St Onge et al., 2010). Furthermore the phasic activation of DA neurons in response to reward-predictive stimuli increases with the probability with which a stimulus predicts reward (Fiorillo et al., 2003). Given that inhibitory signaling may modulate the magnitude of DA responses to reward-predictive stimuli (Matsumoto and Hikosaka, 2007; Jhou et al., 2009), we hypothesized that disrupting GABAA-R signaling in DA neurons would alter the valuation of reward-predictive cues based on probability in β3-KO mice. To test this hypothesis we subjected our mice to a probabilistic selection task. First, mice were trained to discriminate between a lever that delivered one pellet and a lever that delivered two pellets (Fig. 6A). Once the animals had learned to equivalently discriminate between the two levers (Fig. 6B), the probability of reinforcement for the two-pellet lever was varied between 75% and 25%. Although both groups of mice decreased their preference for the two-pellet lever as the probability of its reinforcement decreased, β3-KO mice had a greater preference for the risky lever, even when doing so was not optimal. For instance, when the two-pellet lever was only reinforced 25% of the time, β3-KO mice still had a preference for this lever (∼60%) while control animals tended to avoid the risky lever (∼30% preference) (Fig. 6B).
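To make explicit why preferring the two-pellet lever at low reinforcement probabilities is not optimal, the expected payoff of each lever can be worked out directly. The Python sketch below is only an illustration under stated assumptions: the one-pellet lever is taken to be reinforced on every press, the function name `expected_pellets` is hypothetical, and the 50% entry is included only to mark the indifference point rather than as a condition confirmed by the text.

```python
def expected_pellets(pellets_per_reward: int, probability: float) -> float:
    """Expected number of pellets earned per lever press."""
    return pellets_per_reward * probability


# Assumption: the one-pellet lever pays out on every press.
certain = expected_pellets(1, 1.0)

for probability in (0.75, 0.5, 0.25):
    risky = expected_pellets(2, probability)
    if risky > certain:
        verdict = "risky lever pays more on average"
    elif risky < certain:
        verdict = "certain lever pays more on average"
    else:
        verdict = "both levers pay the same on average"
    print(f"p = {probability:.2f}: risky = {risky:.2f}, certain = {certain:.2f} ({verdict})")
```

Under these assumptions the two-pellet lever yields more food on average only above 50% reinforcement, so a sustained preference for it at 25% sacrifices about half a pellet per choice relative to the certain lever.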
To confirm that β3-KO mice were not simply perseverating on the two-pellet lever, we tested both groups of mice under conditions in which the two-pellet lever was never reinforced. Similar to our finding during extinction of instrumental conditioning, β3-KO and control mice had similar preferences for the two-pellet lever when reinforcement on this lever was reduced to 0% (Fig. 6B). To further rule out the possibility that the β3-KO mice were less able to adapt their preferences in general, we tested the ability of control and β3-KO mice to switch their lever preference in a non-probabilistic task. Mice were trained to prefer a high-reward lever and then the high- and low-reward levers were switched. Both β3-KO and control mice switched their lever preferences at comparable rates (Fig. 6C). These findings suggest that the preference for the risky lever by β3-KO mice was due to an alteration in their decision-making process when required to make a choice based upon their experience of the probabilistic contingencies of each lever during training. Importantly, this effect occurred only when probability was introduced, which argues against simple behavioral inflexibility.
Discussion
Rapid GABAergic transmission onto DA neurons is implicated in numerous behaviors (Schultz et al., 1997; Matsumoto and Hikosaka, 2007; Pan et al., 2008; Brischoux et al., 2009; Jhou et al., 2009). Pharmacologically manipulating GABAA-R signaling in DA neurons is limited by the fact that GABAA-R are expressed in most neurons, including midbrain interneurons (Laviolette and van der Kooy, 2001; Brazhnik et al., 2008; Tan et al., 2010). Our genetic approach selectively disrupts GABAA-R in DA neurons and permits us to determine the behavioral relevance of fast GABAergic transmission in DA neurons.
The observation that DA neurons from β3-KO mice had a lower frequency of mIPSCs, a leftward shift in the distribution of mIPSC amplitudes, and an attenuation in GABA-evoked IPSCs confirms a functional role for GABAA-R β3 in DA neurons (Ramadan et al., 2003; Okada et al., 2004). Surprisingly and despite diminished GABAA-R signaling in DA neurons, basal locomotor activity was normal in β3-KO mice. This observation is similar to our previous finding that mice with attenuated glutamate signaling in DA neurons also have normal basal activity (Zweifel et al., 2009) and highlights the adaptability of this neural system. However, when β3-KO mice were challenged with morphine, they were significantly hyper-responsive. Therefore, although basal DA transmission may be normal, the effect of reduced GABAA-R in DA neurons can be functionally revealed by challenging β3-KO mice with morphine.
We hypothesized that compensatory changes had occurred in the DA system of β3-KO mice through alterations in either excitatory or inhibitory input to DA neurons. Although mEPSC amplitude and frequency were normal in the DA neurons of β3-KO mice, mIPSC amplitude was significantly attenuated in non-DA neurons, which are thought to tonically inhibit DA neurons. These results suggest that GABAergic input to DA neurons is elevated in β3-KO mice and are consistent with their hyperactive response to morphine, which stimulates locomotion by inhibiting GABAergic neurons that synapse onto DA neurons. While the molecular mechanisms of compensation remain unclear, it is unlikely that our genetic approach lacks specificity, since Cre-mediated recombination in Slc6a3Cre/+ mice has been shown to be selective for tyrosine hydroxylase-immunoreactive neurons with Ih current (Zhuang et al., 2005; Zweifel et al., 2008). Because some VTA DA neurons are reported to lack Ih current (Lammel et al., 2008), it is possible that some DA neurons were included in our non-DA dataset. However, Ih-lacking DA neurons constitute a minority of the neurons in the VTA and others have shown that the absence of Ih is a reliable predictor of non-DA neurons (Margolis et al., 2006). Therefore, the decreased mIPSC amplitude observed in non-DA neurons is unlikely to be the result of including Ih-lacking DA neurons in the non-DA neuron dataset.
Despite apparent compensatory mechanisms in β3-KO mice, we predicted that diminished GABAA-R signaling in DA neurons would lead to increased PPTg-stimulation-evoked DA transmission as this phenomenon depends not only on excitation by glutamate and acetylcholine but also on GABAergic tone (Good and Lupica, 2009). Moreover, disinhibition of DA neurons by attenuating GABAA-R signaling elicits burst-firing, which contributes to phasic DA release (Tepper and Lee, 2007; Lobb et al., 2011). In agreement with GABAA-R contributing to phasic DA transmission, we found that PPTg-stimulation-evoked DA release was enhanced in β3-KO mice. Thus, attenuating GABAergic transmission in DA neurons enhances their excitability.
Given the established role of dopamine in learning (Di Ciano et al., 2001; Dalley et al., 2005; Robinson et al., 2005; Parker et al., 2010) and given that phasic DA transmission is enhanced in β3-KO mice, we assessed these animals in appetitive learning. β3-KO mice performed better in two learning paradigms that are known to be influenced by phasic DA transmission (Zweifel et al., 2009), providing behavioral evidence of enhanced phasic DA signaling in these animals. Because phasic inhibition of DA neurons occurs at the time of reward omission and because this phenomenon is thought to contribute to extinction (Waelti et al., 2001; Pan et al., 2008), we predicted that β3-KO mice would be resistant to extinction and reversal learning. Surprisingly, β3-KO mice extinguished their behaviors normally in both paradigms. At face value, this observation suggests that GABAA-R signaling in DA neurons is not important for learning from negative outcomes such as reward omission. However, there is still residual GABAA-R current (∼25%) in the DA neurons of β3-KO mice, and GABAB receptors, which are highly expressed in DA neurons (Brazhnik et al., 2008), are also intact in these animals. Therefore, GABAergic transmission onto DA neurons may still be important for extinguishing associations. Alternatively, the inhibition of DA neurons at the time of reward omission may be driven by pauses in glutamatergic input. Future experiments examining excitatory afferents, such as those from the lateral hypothalamus or medial prefrontal cortex, will better characterize how negative prediction errors are encoded in the brain. Another possibility is that the extinction and reversal procedures used here were too rapid to detect subtle differences due to attenuated GABAA-R signaling in DA neurons. A more gradual extinction procedure, such as one in which fewer extinction or reversal trials are given over more days, might reveal differences between control and β3-KO mice.
Aversive stimuli have been shown to either inhibit or excite dopamine neurons depending on the anatomical location of these cells or the targets to which they project (Inglis and Moghaddam, 1999; Ungless et al., 2004; Roitman et al., 2008; Brischoux et al., 2009; Jhou et al., 2009; Matsumoto and Hikosaka, 2009). Although we have previously demonstrated that DA neuron activation is important for fear conditioning (Zweifel et al., 2009, 2011), little is known about the role of DA neuron inhibition during aversive learning. Our finding that β3-KO mice were normal in two fear conditioning paradigms suggests that DA neuron inhibition is not required for aversive learning. Again, we cannot rule out the possibility that residual inhibition in DA neurons in β3-KO mice is sufficient to enable learning from aversive stimuli. Nonetheless, together with previous work (Zweifel et al., 2011), our findings suggest that phasic activation of DA neurons may be more important than inhibition for aversive learning.
The observation that DA neurons respond more to stimuli that predict reward with greater probability implicates DA in the representation of probability (Fiorillo et al., 2003). Furthermore, emerging evidence in both humans and rodents suggests that DA plays an important role in behaviors involving probabilistic assessment such as gambling (Dodd et al., 2005; St Onge et al., 2010; Winstanley et al., 2011). Our finding that β3-KO mice have a greater preference for the risky lever in a probabilistic selection task suggests that information encoding reward probability is relayed to DA neurons, in part, through GABAergic inputs. Attenuating the sensitivity of DA neurons to GABA signaling may have disrupted the proper integration of information about reward probability at the level of DA neurons and increased the risk preference of β3-KO mice.
If information regarding probability is relayed to DA neurons by GABAergic transmission, where does this information come from? The RMTg receives excitatory input from the LHb and has been described as the predominant source of GABAergic input to DA neurons (Jhou, 2005; Perrotti et al., 2005). Both LHb and RMTg neurons respond inversely to DA neurons during reward learning (Matsumoto and Hikosaka, 2007; Jhou et al., 2009). This circuitry has been further expanded to include the globus pallidus internal segment (GPi), which contains neurons that encode reward prediction errors and project to the LHb (Hong and Hikosaka, 2008). The primary afferent to the GPi comes from the dorsal striatum, which also contains neurons that discriminate probabilistic stimuli (Oyama et al., 2010). The activity of neurons in the dorsal striatum is modulated by risk-associated cortical structures such as the orbitofrontal cortex (Tobler et al., 2007; Schilman et al., 2008). Collectively, these observations suggest a plausible circuit by which cortical structures can modulate the dorsal striatum and its output structures to influence choice based on past experience through GABAergic signaling onto DA neurons.
Gaining better insight into this circuitry will provide more therapeutic targets for treating neuropsychological disorders characterized by irrational decision-making. Because we have shown here that GABAA-R signaling within DA neurons contributes to risky choice, it may be possible to treat patients with impulsive tendencies by selectively augmenting GABAA-R signaling using GABAA-R subtype-specific benzodiazepines. It was recently reported that benzodiazepines exert their reinforcing properties by potentiating GABAA-R currents within the inhibitory interneurons of the VTA, thereby reducing GABAergic transmission onto DA neurons. This effect depended upon the expression of GABAA-R α1, which is not expressed on DA neurons (Tan et al., 2010). The predominant GABAA-R α isoforms expressed in DA neurons are α2 and α3 (Okada et al., 2004). Thus, α2- and α3-selective benzodiazepines would theoretically have a low abuse potential and might alleviate impulsivity. The possibility of treating psychiatric illness by targeting distinct neuronal types expressing unique permutations of GABAA-R subtypes holds promise for the future of drug development.
Footnotes
This work was supported by NIDA Grants DA07278 (J.G.P. and M.E.S.) and DA024908 (R.D.P.), and NIH Grants DA026273 (M.J.W.), NS052536 (N.S.B.), MH089887 (L.S.Z.), and NS060803 and HD02274 (N.S.B.).
Correspondence should be addressed to Dr. Richard D. Palmiter, HHMI and Department of Biochemistry, Box 357370, University of Washington, Seattle, WA 98195. palmiter@uw.edu