Understanding the psychobiological basis of relapse remains a challenge in developing therapies for drug addiction. Relapse in cocaine addiction often occurs following exposure to environmental stimuli previously associated with drug taking. The metabotropic glutamate receptor, mGluR5, is potentially important in this respect; it plays a central role in several forms of striatal synaptic plasticity proposed to underpin associative learning and memory processes that enable drug-paired stimuli to acquire incentive motivational properties and trigger relapse. Using cell type-specific RNA interference, we have generated a novel mouse line with a selective knock-down of mGluR5 in dopamine D1 receptor-expressing neurons. Although mutant mice self-administer cocaine, we show that reinstatement of cocaine-seeking induced by a cocaine-paired stimulus is impaired. By examining different aspects of associative learning in the mutant mice, we identify deficits in specific incentive learning processes that enable a reward-paired stimulus to directly reinforce behavior and to become attractive, thus eliciting approach toward it. Our findings show that glutamate signaling through mGluR5 located on dopamine D1 receptor-expressing neurons is necessary for incentive learning processes that contribute to cue-induced reinstatement of cocaine-seeking and which may underpin relapse in drug addiction.
The most challenging feature of cocaine addiction is the high risk of relapse even after long periods of abstinence. A common trigger of relapse in vulnerable individuals is exposure to environmental stimuli previously associated with drug use (Stewart et al., 1984). The enduring control over relapse by cocaine-paired stimuli reflects the ability of addictive drugs to hijack neural substrates of associative reward-learning and memory that normally enable environmental stimuli paired with natural rewards (e.g., food or water) to guide adaptive behaviors (Robinson and Berridge, 1993; Berke and Hyman, 2000; Kauer and Malenka, 2007). However, associative reward-learning can be dissociated into a variety of psychologically and neurobiologically distinct processes (Everitt et al., 2001). Consequently, understanding the psychobiological basis of relapse is of considerable importance for developing effective treatments for cocaine addiction.
A common neuronal substrate of associative reward-learning processes involves striatal medium spiny neurons (MSNs), which integrate mesostriatal dopaminergic signals and glutamatergic inputs arising from cortical and limbic regions (Kauer and Malenka, 2007; Goto and Grace, 2008). MSNs provide the sole striatal output to motivational and motor systems and can be divided into two functionally distinct populations, expressing either dopamine D1 (D1-MSNs) or D2 (D2-MSNs) receptors (Gerfen et al., 1990; Heiman et al., 2008; Valjent et al., 2009). However, the relative contributions of D1- and D2-MSNs to motivational output and the molecular events in MSNs underpinning associative reward-learning processes that contribute to relapse-like behaviors remain elusive.
The metabotropic glutamate receptor, mGluR5, is particularly interesting in this context. It is involved in several forms of plasticity in striatal MSNs that are proposed to mediate associative learning and memory processes (Sung et al., 2001; Gubellini et al., 2003; Malenka and Bear, 2004; Hyman et al., 2006; Schotanus and Chergui, 2008), and which are affected by cocaine experience (Martin et al., 2006; Kauer and Malenka, 2007; Kourrich et al., 2007; Bellone et al., 2008; Anwyl, 2009; Moussawi et al., 2009). Although mGluR5 is densely expressed on both D1- and D2-MSN populations (Tallaksen-Greene et al., 1998), converging lines of research would suggest that mGluR5 located specifically on D1-MSNs is ideally positioned to influence associative reward-learning processes that may underpin relapse triggered by drug-paired stimuli. First, there is evidence that striatal dopamine D1 receptors (D1R) play a critical role in both the consolidation of associative reward-learning memories (Dalley et al., 2005) and many of the long-term effects of addictive drugs (Anderson and Pierce, 2005) and second, mGluR5 appears to interact closely with D1Rs to regulate striatal neurotransmission (Paolillo et al., 1998; Voulalas et al., 2005; Schotanus and Chergui, 2008).
Here, we determine the role of mGluR5 located on dopamine D1 receptor (D1R)-expressing neurons, in behaviors influenced by drug- or natural reward-paired stimuli, by generation of a novel mouse line in which mGluR5 is selectively knocked-down in neurons expressing the D1R. These mice reveal a necessary role of mGluR5 located on D1R-expressing neurons for highly specific associative reward-learning processes underlying cue-induced reinstatement of cocaine-seeking.
Materials and Methods
Short hairpin RNAs were designed using the sFold (sTarMir) and BLOCK-iT RNAi Designer (Invitrogen) software packages and tested in cell culture for knock-down (KD) efficiency of mGluR5 mRNA. The BLOCK-iT Pol II miR RNAi Expression vector kit with GW/EmGFP-miR vector (Invitrogen) was used to insert synthetic oligos into an artificial miRNA context (Fig. 1B). The construct was recombined into a bacterial artificial chromosome (BAC; RP24–179E13; Children's Hospital Oakland Research Institute, Oakland, CA) harboring the mouse D1R gene following a procedure previously described (Parkitna et al., 2009) (Fig. 1A). The BAC was purified, the vector sequences were removed, and the transgene was injected into the pronuclei of fertilized oocytes from C57BL/6N mice. Experimental animals were generated by backcrossing mGluR5KD-D1 transgenic mice to the C57BL/6N line. Transgenic animals were genotyped using the following primers: ACGTAAACGGCCACAAGTTC, AAGTCGTGCTGCTTCATGTG. Food and water were provided ad libitum. KD and wild-type (WT) littermates 8–20 weeks of age were used for the neurobiological characterization of the transgenic lines.
In Situ hybridization
An ∼900-bp-long digoxigenin (DIG)-labeled riboprobe was used for mGluR5 mRNA detection. The DNA template was synthesized using the primers: ACCCCTATCTGCTCTTCCTACC and GTCTACTGAATGGAGGGACCAG. Probe was generated using a DIG RNA Labeling Kit (SP6/T7) from Roche. Brains were fixed in 4% paraformaldehyde at 4°C for 48 h and 50 μm free-floating vibratome sections were hybridized with the DIG-labeled probe at 70°C overnight. Signal was developed using alkaline phosphatase-conjugated antigen binding fragments and 5-bromo-4-chloro-3′-indolylphosphate p-toluidine salt and nitroblue tetrazolium chloride as a substrate (Roche).
RNA was isolated (RNeasy Mini Kit, QIAGEN) from striata fixed in RNAlater solution (Ambion) at 4°C overnight. cDNA was synthesized using 250 ng of total RNA as template and oligo-dT reverse-transcription primer (TaqMan Reverse Transcription Reagents, Applied Biosystems). The quantity of specific transcripts was measured using the TaqMan gene expression assays against mGluR5 (Mm01317988_m1), Hprt1 (Mm01545399_m1), Gfap (Mm00546086_m1) and a custom assay for EmGFP. The quantification of mature microRNAs in the striatum was performed on samples containing only small RNAs (<200 nt) isolated using mirVana miRNA Isolation Kit (Ambion, catalog #AM1561). Removal of ribosomal RNA was verified on RNA LabChips. Small RNAs were detected by quantitative PCR using MicroRNA Reverse Transcription kit (Applied Biosystems) and microRNA detection assays: mmu-miR-9 (part #4373371), hsa-miR-15a (4373123), hsa-miR-16 (4373121), mmu-miR-124a (4373150), hsa-miR-138 (4373175), snoRNA-234 (4380915) on 10 ng of the small RNA sample.
Immunohistochemistry and immunofluorescence
Immunohistochemistry with anti-GFP antibody (1:10 000, Invitrogen, A11122, Lot 50434A) was performed using avidin-biotin-peroxidase complex (ABC) amplification and diaminobenzidine as a substrate. For immunofluorescence we used: rabbit anti-GFP (1:1000, Invitrogen, see above), donkey anti-rabbit Alexa Fluor 488 (1:100, Invitrogen, A21206), chicken anti-GFP (1:1000, Abcam, ab13970), donkey anti-chicken Alexa Fluor 488, rabbit anti-prepro enkephalin (Neuromics, RA15125), goat anti-rabbit Alexa Fluor 568, mouse anti-DARPP-32 (BD Transduction Laboratories, 611520), mouse anti-NeuN (1:400, Millipore Bioscience Research Reagents, MAB377, Lot 0604027006), Cy5-conjugated anti-mouse (Jackson ImmunoResearch), chicken anti-mouse Alexa Fluor 594 (1:100, Invitrogen, A21201). Image analyses were performed with the ImageJ (v1.37, Wayne Rasband, National Institutes of Health, Bethesda, MD) and Creative Suite CS4 (Adobe) software. GFP and NeuN-positive cells were counted on 8 consecutive striatal sections per animal.
Striatal samples were homogenized and denatured at 100°C for 10 min. Protein concentration was measured using the bicinchoninic acid (BCA) assay (Sigma-Aldrich). Proteins were detected by rabbit polyclonal anti-mGluR5 Ab (1:500, Abcam, ab53090). Monoclonal mouse anti-GAPDH Ab (1:10 000, Millipore, #MAB374) was used as a loading control. The secondary antibodies used were goat anti-rabbit HRP-linked Ab (1:10 000, Cell Signaling Technology, #7074) and goat anti-mouse HRP-conjugated Ab (1:10 000, Jackson ImmunoResearch, #115-036-003). The membrane was developed with substrate ECL plus Western Blotting Reagents Mix (GE Healthcare).
Animals for behavioral analysis
Cocaine studies were conducted in Mannheim, Germany while associative learning studies took place in Brighton, UK. In both laboratories, male WT and KD mice (minimum 8 weeks old) were maintained on a 12–12 h light-dark cycle (with lights on at 7:00 AM) under controlled temperature (21 ± 2°C) and humidity (50 ± 5%) conditions. All experiments took place during the light phase. For cocaine studies, mice were single housed and for conditioning studies, mice were single or group housed. For all studies, body weights were maintained at ∼85% of ad libitum feeding weight except for the cocaine self-administration phase during which mice received ad libitum access to food. Experiments were conducted in accordance with European Union guidelines on the care and use of laboratory animals; experiments in Germany were approved by the local animal care committee (Karlsruhe, Germany); experiments in the UK were performed in accordance with the United Kingdom 1986 Animals (Scientific Procedures) Act, following institutional ethical review.
Apparatus for cocaine and associative learning studies
Behavioral training and testing were performed in mouse conditioning chambers (Med Associates), individually housed within sound and light attenuating cubicles. Each chamber was equipped with a pellet dispenser connected to a recessed food magazine. A retractable lever was located on each side of the magazine and a cue light was positioned above each lever. A tone generator was situated between the cue lights and a house light was positioned on the wall opposite to the food magazine. For the cocaine studies, polyethylene/PVC tubing connected the implanted catheter, via a swivel (Instech Solomon), to an infusion pump (PHM-100, Med Associates) located outside of the cubicle. For the sign-tracking tests, two nose-poke holes, each of which contained a cue-light, were inserted into the conditioning chamber opposite to the food magazine. Conditioning chambers were controlled and responses were recorded using a computer running Med-PC IV (Med Associates).
Lever training and surgery.
The procedures for lever training, surgery and catheter maintenance were as previously described (Mameli et al., 2009). In brief, to familiarize mice with the action of lever pressing, all mice were trained to lever press for food for a minimum of 14 sessions. The implantation of an indwelling catheter in the right jugular vein occurred 24 h after completion of lever training. Animals were given a minimum of 48 h recovery before cocaine self-administration sessions began.
Once-daily, 90 min, self-administration sessions commenced with the insertion of two levers into the conditioning chamber. Responses on one lever (the active lever), under a fixed-ratio 4 schedule (FR4), resulted in a 14–28 μl infusion of cocaine (cocaine hydrochloride; Sigma-Aldrich) delivered by activation of the pump for 1.2–2.4 s. Responses on the alternative lever (the inactive lever) were recorded, but had no scheduled consequence. Each drug infusion was associated with the 20 s presentation of flashing (1 Hz) cue lights [conditioned stimulus (CS)], which also signaled a time-out period during which further lever responses were not reinforced.
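The reinforcement contingency described above can be illustrated with a minimal sketch. It assumes that lever presses made during the 20 s CS/time-out period are not reinforced and also do not count toward the next ratio; the latter is an assumption made for illustration and is not stated in the text. The function name and the press timestamps are hypothetical.

```python
def count_infusions(press_times, ratio=4, timeout_s=20.0):
    """Count infusions earned from active-lever press timestamps (in seconds).

    Illustrative FR schedule with a time-out:
    - every `ratio`-th eligible press earns one infusion;
    - each infusion starts a `timeout_s` period (the CS) during which
      presses are ignored (assumption: they do not count toward the ratio).
    """
    infusions = 0
    presses_toward_ratio = 0
    timeout_ends = float("-inf")
    for t in sorted(press_times):
        if t < timeout_ends:
            continue  # press falls inside the time-out: not reinforced
        presses_toward_ratio += 1
        if presses_toward_ratio == ratio:
            infusions += 1
            presses_toward_ratio = 0
            timeout_ends = t + timeout_s
    return infusions


# Hypothetical session fragment: the 4th press (t = 4 s) earns an infusion,
# presses at 5, 10 and 23 s fall inside the time-out, and the presses at
# 25-28 s complete a second ratio.
example_presses = [1, 2, 3, 4, 5, 10, 23, 25, 26, 27, 28]
print(count_infusions(example_presses))  # 2
```

Whether time-out responses should be tallied separately (for example as a measure of perseverative responding) is a design choice left outside this sketch.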
For dose–response determination, KD (n = 14) and WT (n = 14) mice were given access to different cocaine doses (0.095–1.5 mg/kg per infusion) in a randomized order during 90 min once-daily self-administration sessions. When self-administration behavior was stable for one dose (three consecutive sessions with ≤ ±20% variation in the number of infusions earned), mice were given access to a different cocaine dose. Data from the third stable session of self-administration, from animals with a patent catheter, were used to generate the dose–response curve.
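One possible reading of the stability criterion is sketched below. It assumes that "variation" means the deviation of each of the last three sessions from their common mean; the text does not specify the reference value, so this interpretation, the function name and the example counts are illustrative only.

```python
def is_stable(infusions_per_session, window=3, tolerance=0.20):
    """Return True if the last `window` sessions each lie within
    +/- `tolerance` (as a fraction) of their own mean.

    Assumption: variation is measured relative to the mean of the
    window; other definitions (e.g. relative to the first session)
    are equally plausible.
    """
    if len(infusions_per_session) < window:
        return False
    recent = infusions_per_session[-window:]
    mean = sum(recent) / window
    if mean == 0:
        return False  # no responding at all is not treated as stable here
    return all(abs(x - mean) <= tolerance * mean for x in recent)


# Hypothetical infusion counts across sessions at one dose.
print(is_stable([10, 20, 21, 19]))  # True: last three within 20% of mean 20
print(is_stable([10, 20, 30]))      # False: 10 and 30 deviate by 50%
```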
KD (n = 7) and WT (n = 6) mice were trained to self-administer cocaine (0.75 mg/kg per infusion) during 10 consecutive sessions under identical conditions to those described above. In addition, 7 animals (n = 4/3; KD/WT) which received cocaine 0.75 mg/kg per infusion as the final dose of the dose–response study were added to this experimental cohort. After the final cocaine self-administration session, mice received 14, once daily, 90 min extinction sessions in which responses on both levers were recorded but had no scheduled consequence. Prior studies from our laboratory (unpublished) revealed that 14 extinction sessions were sufficient to produce stable lever responding with active lever responses reduced to 50% or less of responses maintained by cocaine, as well as complete loss of discrimination between the active and inactive levers. Reinstatement tests took place 24 h after the last extinction session under conditions identical to the final session of cocaine self-administration, except that cocaine was not available. Thus, responses on the previously active lever triggered the noise of the infusion pump and a brief CS presentation. Responses on the inactive lever were without consequence.
Associative learning studies
Mice were assigned to one of three experimental cohorts; one for the assessment of both goal-tracking responses and conditioned reinforcement (CRf), a second for Pavlovian-instrumental transfer (PIT) and a third for sign-tracking. The use of different Pavlovian conditioning procedures for CRf and PIT studies was in recognition of data indicating that these procedures were most suitable for supporting subsequent CRf or PIT behavior (Crombag et al., 2008).
To familiarize mice with the food used for conditioning studies (5TUL, catalog #1811142; Test Diet), a small amount of the food was given to all mice in their home cage. Mice also received a single, 30 min, magazine training session in which food pellets were delivered once every 60 s, on average (range of 25 to 95 s).
Goal-tracking and conditioned reinforcement.
The procedures for Pavlovian conditioning, goal-tracking and CRf tests were as previously described (O'Connor et al., 2010). In brief, KD (n = 12) and WT (n = 9) mice received 11, once daily, 60 min Pavlovian conditioning sessions in which 16 presentations of a 10 s stimulus paired with food delivery (CS+; flashing cue lights or constant tone) and 16 presentations of a 10 s stimulus paired with no outcome (CS−; the alternative stimulus) occurred. Each stimulus trial was separated by a variable, no stimulus, intertrial interval (ITI) [range of 80–120 s; mean (M) = 100 s]. Food delivery occurred 5 s after CS+ onset. Assessment of the acquisition of goal-tracking responses was provided by recording food magazine head entries that occurred in the first five seconds following CS+ onset (that is, before food delivery). The 45 min CRf test was undertaken 24 h after the final conditioning session and commenced with the insertion of two levers into the conditioning chamber. Responses on one lever resulted in brief presentations of the CS+, whereas responses on the alternative lever resulted in brief presentations of the CS−. No food was delivered during the CRf test.
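The goal-tracking measure described above amounts to counting magazine head entries in the interval between CS+ onset and food delivery. A minimal sketch follows, assuming event timestamps in seconds and treating the window as half-open so that an entry coinciding with food delivery is excluded; the function name, this boundary convention and the example timestamps are assumptions for illustration.

```python
def goal_tracking_entries(cs_onsets, head_entries, window_s=5.0):
    """Count head entries in [onset, onset + window_s) for each CS+ trial.

    `cs_onsets` and `head_entries` are timestamps in seconds. The window
    ends at food delivery (5 s after CS+ onset), which is excluded.
    """
    return [
        sum(1 for t in head_entries if onset <= t < onset + window_s)
        for onset in cs_onsets
    ]


# Hypothetical data: two CS+ trials starting at 100 s and 210 s.
onsets = [100.0, 210.0]
entries = [99.5, 100.2, 103.9, 105.0, 212.0, 216.0]
print(goal_tracking_entries(onsets, entries))  # [2, 1]
```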
KD (n = 9) and WT (n = 7) mice received 12, once daily, 30 min Pavlovian conditioning sessions in which four presentations of a 2 min stimulus paired with food delivery (CS+; an intermittent tone or flashing house light) occurred. Each stimulus event was separated by a variable, no-stimulus, ITI (range of 225–375 s; M = 300 s). Mice then received a further six 45 min conditioning sessions, in which two presentations of a 2 min stimulus paired with no outcome (CS−; the alternative stimulus) occurred, along with four reinforced presentations of the CS+. The order of stimulus presentations was randomly determined and each stimulus was separated by a variable, no-stimulus, ITI (range of 205–395 s; M = 300 s). Four food pellets were delivered during each CS+ presentation. Pellet delivery was equally likely to occur in each 10 s time bin throughout the CS+, although a minimum time of 10 s separated each pellet delivery.
Following Pavlovian conditioning sessions, mice were trained to lever press for food under a variable interval 60 s schedule (VI60) of reinforcement. Each food self-administration session commenced with the insertion of two levers. Responses on one lever (the active lever) resulted in food delivery, while responses on the alternative lever (the inactive lever) had no scheduled consequence. Instrumental training sessions terminated after 30 food pellets had been obtained, or 30 min had elapsed.
The PIT test commenced with the insertion of both levers and for the first 5 min, no stimuli were presented. This period was followed by 4 presentations of the 2 min CS+ and 4 presentations of the 2 min CS−, occurring in an alternating order. Each stimulus presentation was preceded by a 2 min, no-stimulus ITI. No food was delivered during the test. An elevation score was calculated to assess changes in active lever response rate during CS+ and CS− presentations (elevation score = lever responses during CS+ or CS− presentations minus lever responses during the no-stimulus ITI period before CS+ or CS− presentations, respectively).
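The elevation score defined above is a simple per-presentation difference. The sketch below computes it for one stimulus type; because each stimulus and each preceding no-stimulus ITI lasted 2 min, raw response counts can be subtracted directly. The function name and the response counts are hypothetical.

```python
def elevation_scores(cs_responses, pre_cs_responses):
    """Elevation score per presentation: active-lever responses during the
    stimulus minus responses during the no-stimulus ITI that preceded it."""
    if len(cs_responses) != len(pre_cs_responses):
        raise ValueError("each stimulus presentation needs one preceding ITI count")
    return [cs - pre for cs, pre in zip(cs_responses, pre_cs_responses)]


# Hypothetical counts for the four CS+ presentations of one mouse.
cs_plus = [12, 9, 15, 10]
pre_cs_plus = [5, 6, 7, 4]
scores = elevation_scores(cs_plus, pre_cs_plus)
print(scores)                     # [7, 3, 8, 6]
print(sum(scores) / len(scores))  # 6.0
```

A positive mean score indicates that responding rose during the stimulus relative to baseline; the same function applies unchanged to the CS− presentations.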
KD (n = 12) and WT (n = 12) mice received 11, once daily, 30 min Pavlovian conditioning sessions in which 16 presentations of a 10 s stimulus paired with food delivery (CS+; flashing cue lights) occurred. Each CS+ presentation was separated by a variable, no stimulus, ITI (range of 80–120 s; M = 100 s). A single food pellet was delivered 5 s after CS+ onset. For the 45 min sign-tracking test, conducted 24 h after the final conditioning session, two nose-poke holes were inserted into the conditioning chamber. In one hole, 15 × 1 min presentations of a flashing cue light (that is, the CS+) occurred. Each CS+ presentation was separated by a 2 min no-stimulus ITI. No stimulus presentations occurred in the second (control) nose-poke hole and no food was delivered during the test. Entries into each hole were recorded during CS+ presentations, thus providing a measure of sign-tracking responses (that is, approaches) toward the CS+.
For the assessment of the knock-down efficiency by quantitative PCR and Western blotting, statistical analyses were performed using the t test. For cocaine (self-administration and cue-induced reinstatement) and associative learning (goal-tracking, CRf and PIT) studies, data were initially analyzed by mixed-factor ANOVA, where genotype comparisons were represented by the between-subjects factor of genotype (WT, KD). When a significant (p ≤ 0.05) main effect or interaction term was found, further analysis was performed using ANOVA and post hoc comparisons by Newman–Keuls or t test. For the sign-tracking test, approaches toward the CS+ or a control nose-poke hole were initially compared for each genotype by Wilcoxon matched pairs test, with comparisons of responding in each nose-poke between genotypes made by Mann–Whitney U test.
Generation and validation of mice with knock-down of mGluR5 selectively in D1R-expressing neurons
To test the role of mGluR5 on D1R-expressing neurons we generated mice with a selective knock-down of mGluR5 in these cells (mGluR5KD-D1 mice). We used a construct that expresses two artificial microRNAs targeting mGluR5 mRNA under the control of the D1R promoter (Fig. 1A,B). The coding sequence for green fluorescent protein (GFP) was introduced in tandem with the microRNAs (Fig. 1A), enabling us to easily track expression of the construct. Immunostaining of GFP in brains from mGluR5KD-D1 mice showed that the expression pattern fits with that described for D1Rs, including strong expression in the dorsal striatum and nucleus accumbens (Fig. 1C). A more detailed examination of the striatum confirmed that the transgene (GFP) was expressed in ∼53% of the striatal neurons (Fig. 1D, NeuN). Furthermore, expression of the transgene was confined to MSNs (identified by immunostaining against DARPP-32) (Fig. 1E) but the transgene was not expressed in D2-MSNs (identified by immunostaining against preproenkephalin; ppEnk) (Fig. 1E), showing that expression is restricted to D1-MSNs. Next, we analyzed whether expression of the transgene reduces the abundance of the mGluR5 transcript. In situ hybridization revealed reduced numbers of mGluR5-positive cells in the striatum, while the staining intensity in the cells still expressing mGluR5 was not reduced (Fig. 1F), indicating strong mGluR5 knock-down selectively in the targeted cells. The abundance of mGluR5 transcript was reduced to ∼50% in the homogenized striatum (Fig. 1G) with the corresponding protein reduced to ∼50% compared with levels in WT mice (Fig. 1H). Since the expression of the construct is restricted to D1-MSNs (Fig. 1E), we estimate that the knock-down efficiency is ∼90% in the targeted cells. There was no significant reduction of mGluR5 mRNA in the cerebral cortex or in the hippocampus of mGluR5KD-D1 mice (Fig. 1G).
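The per-cell knock-down estimate can be related to the whole-tissue measurements by a back-of-the-envelope calculation. The sketch below is not necessarily the calculation the estimate was based on; it assumes that mGluR5 is expressed at similar levels in all striatal neurons, that non-targeted cells are unaffected, and that non-neuronal contributions are negligible. The function name is hypothetical.

```python
def knockdown_efficiency(remaining_fraction, targeted_fraction):
    """Estimate the knock-down within targeted cells.

    remaining_fraction: mGluR5 level in whole tissue relative to WT (0-1)
    targeted_fraction:  fraction of neurons expressing the transgene (0-1)

    Under the stated assumptions, the tissue-level loss
    (1 - remaining_fraction) is carried entirely by the targeted cells.
    """
    return (1.0 - remaining_fraction) / targeted_fraction


# Values taken from the text: protein reduced to ~50% of WT levels,
# transgene expressed in ~53% of striatal neurons.
print(round(knockdown_efficiency(0.50, 0.53), 2))  # 0.94
```

Under these assumptions the tissue-level protein reduction is of the same order as the ∼90% estimate for the targeted cells; unequal mGluR5 expression between D1- and D2-MSNs would shift the figure.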
Off-target effects (that is, knock-down of mRNAs other than mGluR5) and disruption of endogenous microRNA processing are potential concerns when using interfering RNAs. To exclude the possibility of off-target effects we measured the abundance of transcripts of other mGluR-family members and the related GABAB1 receptor (Fig. 2A). In contrast to mGluR5, the abundance of the other transcripts was normal. Further, the level of short RNAs in general, as well as the amount of several randomly selected endogenous mature microRNAs, were normal in the striatum of mGluR5KD-D1 mice (Fig. 2B) confirming normal function of the endogenous microRNA processing machinery. Together, our data indicate a highly specific and efficient knock-down of mGluR5 mRNA without off-target effects or disruption of endogenous microRNA function.
Cocaine self-administration and cocaine-seeking in mGluR5KD-D1 mice
To explore the consequence of the specific knock-down of mGluR5 for behaviors related to cocaine addiction, we first examined the propensity of mGluR5KD-D1 mice to self-administer cocaine. When given access, in a randomized order, to five different doses of cocaine under a fixed-ratio (FR4) schedule of reinforcement, WT and mGluR5KD-D1 mice displayed comparable self-administration behavior (Fig. 3A). Responses on the ‘active’ lever, which resulted in cocaine infusions and the concomitant presentation of a simple light stimulus, exhibited comparable inverted U-shape curves between genotypes, demonstrating that mGluR5KD-D1 mice were able to adapt their responding to the dose of cocaine available. Moreover, when trained to self-administer cocaine (0.75 mg/kg per infusion) for 10 consecutive sessions, both WT and mGluR5KD-D1 mice rapidly acquired and maintained stable responding on the active lever (Fig. 3B). Collectively, these results indicate that the primary reinforcing effects of cocaine are unaffected by knock-down of mGluR5 on D1R-expressing cells.
The ability of the stimulus associated with cocaine infusions to reinstate extinguished cocaine-seeking was then assessed. Following stable responding on the active lever during cocaine self-administration sessions, cocaine-seeking responses were extinguished by withholding further drug infusions and stimulus presentations. During extinction sessions, both genotypes significantly reduced responding on the active lever (Fig. 3C). During the test of cue-induced reinstatement of cocaine-seeking, mGluR5KD-D1 mice made significantly fewer responses than WT mice on the active lever that now resulted in presentation of the previously cocaine-paired stimulus, but not cocaine itself (Fig. 3D). These findings indicate that mGluR5 located on D1R-expressing cells is intimately involved in the reinstatement of cocaine-seeking maintained by a cocaine-paired stimulus.
Associative learning in mGluR5KD-D1 mice
Through associative learning, a stimulus paired with reward (CS) can acquire informative or predictive properties that serve to signal the availability and/or location of the reward (goal-tracking) and can also acquire incentive motivational properties enabling CSs to attract attention (sign-tracking), energize ongoing reward-seeking (Pavlovian-instrumental transfer), and/or directly reinforce instrumental behaviors (conditioned reinforcement) (Rescorla, 1988; Robinson and Flagel, 2009). In principle, any of these neurobiologically distinct learned properties could contribute to the effects of drug-paired stimuli on drug-seeking and relapse (Everitt and Robbins, 2005). The next series of experiments examined the consequence of mGluR5 knock-down on D1R-expressing cells for these different aspects of associative reward-learning processes.
Using Pavlovian conditioning procedures, cohorts of hungry mice were presented with a stimulus associated with food delivery (CS+) and a second stimulus associated with no outcome (CS−) (conditioning data from conditioned reinforcement/ goal-tracking cohort shown in Fig. 4A). There was no genotype difference in the learning of predictive properties of the CS+ that enable it to signal the availability and location of reward, as indicated by an increase across conditioning sessions in the number of head-entries into the food-delivery magazine that occurred following onset of the CS+, but before food delivery (goal-tracking responses; Fig. 4B). mGluR5KD-D1 mice were also able to attribute incentive properties to the CS+ necessary for energizing ongoing reward-seeking, as demonstrated by the ability of noncontingent CS+ presentations to enhance responding on a lever previously associated with food delivery (Pavlovian-instrumental transfer test; Fig. 4D).
However, when a CS+ was presented contingent upon a novel instrumental response, mGluR5KD-D1 mice made significantly fewer responses on the lever that resulted in CS+ presentations than WT mice (conditioned reinforcement test; Fig. 4E). In this test, there were no genotype differences in responses on the lever that resulted in CS− presentations, or the latency to explore either lever (lever, genotype, and lever × genotype interaction, F < 1). The specific impairment in CS+ reinforced lever responding could not be attributed to a general inability of mGluR5KD-D1 mice to acquire an instrumental response, because they readily acquired instrumental responding when it was reinforced by the primary food reward (see food self-administration training data from Pavlovian-instrumental transfer cohort, Fig. 4C). Together, these data indicate a necessary role of mGluR5 on D1R-expressing neurons for incentive learning that enables a CS+ to serve as a conditioned reinforcer.
Finally, the ability of the CS+ to attract behavior was assessed by relocating a discrete light CS+ behind a nose-poke hole and measuring approach responses toward it. mGluR5KD-D1 mice made significantly fewer approaches toward the light CS+ than WT mice and there were no significant genotype differences in responses into the control nose-poke hole (sign-tracking test; Fig. 4F). Thus, in addition to the aforementioned deficit in conditioned reinforcement, mGluR5 knock-down on D1R-expressing neurons resulted in a deficit in the attribution of incentive properties to the CS+ necessary for the CS to become highly salient and attractive (Robinson and Berridge, 1993; Tomie et al., 2008).
Using cell type-specific RNA interference, we have generated a novel mouse line in which the metabotropic glutamate receptor, mGluR5, is selectively knocked-down on cells that express dopamine D1 receptors. We identify this mGluR5 population as playing a dissociable role in the primary versus secondary (that is, conditioned) reinforcing effects of cocaine, as revealed by normal cocaine self-administration but impaired cue-induced reinstatement of cocaine-seeking in mGluR5KD-D1 mice. A detailed assessment of reward-learning in these mice reveals specific deficits in learning processes necessary for the attribution of incentive motivational properties to reward-paired stimuli that enable them to directly reinforce behaviors (conditioned reinforcement) and to become highly salient and attractive (sign-tracking). However, other aspects of reward learning were normal in mutant mice, including learning about the predictive properties of reward-paired stimuli which serve to signal the availability and location of reward (goal-tracking) and incentive learning that enables the reward-paired stimulus to energize responding directed toward obtaining a reward (Pavlovian-instrumental transfer). Collectively, our data indicate that mGluR5 located on D1R-expressing neurons plays a central role in specific associative reward-learning processes, which are engaged following cocaine experience and thereby enable environmental stimuli associated with cocaine to exert a prolonged and pervasive influence over relapse susceptibility.
To interfere with the expression of mGluR5 selectively in D1R-expressing neurons we used a BAC-based construct in which a conventional RNA-polymerase II promoter (the D1R-promoter) drives the expression of artificial microRNAs and a reporter. A similar approach has been reported previously for interference with other genes in nurse cells (Rao et al., 2006) and, together with a very recent report (Garbett et al., 2010), our findings show that this technique can be used successfully in the brain. Compared with conditional gene deletion, this approach has the advantage that it involves only one mouse line and offers the prospect of being used, in modified forms, in other organisms in which targeted mutagenesis is not feasible. Previous use of RNAi-based approaches has raised our awareness that excessive levels of short RNAs may oversaturate exportin 5 and thus block the processing of endogenous short RNAs, leading to perturbed cellular homeostasis (Grimm et al., 2006). This is not the case for the mGluR5KD-D1 mice, where maturation of short RNAs is normal. Most likely, previously reported problems were caused by the use of tools resulting in very high levels of short RNAs, such as strong RNA polymerase III promoters or the use of shRNAs instead of artificial microRNAs (Boudreau et al., 2009). Another potential problem is off-target effects. Although we cannot completely exclude interference with the translation of other RNAs, we show that the levels of mRNAs with partial complementarity to the microRNAs are not affected. Collectively, this suggests that artificial microRNAs driven by cell type-specific promoters will be a very useful addition to the neuroscience tool-box, greatly reducing the necessary size of transgenic animal colonies.
Using the cue-induced reinstatement model, considered an animal model of relapse vulnerability (Shaham et al., 2003; Sanchis-Segura and Spanagel, 2006; Stephens et al., 2010), our current findings add to previous reports indicating a role of mGluR5 in regulating behavioral responses to cocaine (Chiamulera et al., 2001) and cocaine-paired cues (Bäckström and Hyytiä, 2006) by suggesting a location of mGluR5 necessary for the cue-induced reinstatement of cocaine-seeking, while the primary reinforcing effects of cocaine are unaffected following specific knock-down of mGluR5 on D1R-expressing neurons. Our study also lends mechanistic confidence to previous reports that have used pharmacological tools to identify a role of mGluR5 in behaviors maintained by reward-paired stimuli (Tessari et al., 2004; Bespalov et al., 2005; Bäckström and Hyytiä, 2006; Schroeder et al., 2008; Gass et al., 2009; Kumaresan et al., 2009; Martin-Fardon et al., 2009; O'Connor et al., 2010), since these reports could have reflected off-target (Olive, 2009), anhedonic (Bäckström and Hyytiä, 2007) or reinforcing (van der Kam et al., 2009) effects of the pharmacological tools used.
Associative reward-learning, which endows drug-paired stimuli with the properties necessary for triggering relapse-like behaviors, is not a unitary process but can be dissociated psychologically, neurobiologically (Everitt et al., 2001) and genetically (Mead and Stephens, 2003a,b). Thus, to determine precisely which reward-learning processes were disrupted in mutant mice, we used Pavlovian conditioning procedures in which a stimulus was paired with the delivery of food [that is, the unconditioned stimulus (US)]. A potential limitation of this approach is that the extent to which neural circuitries that mediate associative learning for natural reinforcers (such as food) overlap with those engaged by drug reinforcers is not fully understood. However, attempts to employ purely Pavlovian conditioning procedures using a “drug US” have been hampered by the negative behavioral effects associated with non-response-contingent drug delivery (Dworkin et al., 1995; Mitchell et al., 1996; Arroyo et al., 1998). Nevertheless, our findings that cocaine-seeking and specific incentive learning processes were both impaired in mutant mice provide empirical support for multiple contemporary theories of drug addiction, which propose that the ability of drug-paired stimuli to influence drug-seeking and relapse reflects the interactions of addictive drugs with neural systems that normally subserve associative reward-learning processes for natural reinforcers (Stewart et al., 1984; Tiffany, 1990; Robinson and Berridge, 1993; Everitt et al., 2001; Stephens and Duka, 2008; Thomas et al., 2008).
An advantage of the behavioral models used in the present study is that the underlying neural circuitry is relatively well characterized. The nucleus accumbens is crucial for learning necessary for conditioned reinforcement (Parkinson et al., 1999; Ito et al., 2004), the development of sign-tracking responses (Parkinson et al., 2000; Di Ciano et al., 2001) and reinstatement of cocaine-seeking (Fuchs et al., 2004). This strongly suggests that the deficits in associative learning and reinstatement of cocaine-seeking observed in the mGluR5KD-D1 mice are due to the lack of mGluR5 in D1-MSNs in the nucleus accumbens. Moreover, the continued expression of mGluR5 on non-D1 MSNs [that is, D2-MSNs, except the minority expressing both D2R and D1R (Valjent et al., 2009)] was insufficient to support specific incentive learning processes and relapse-like behaviors in mutant mice. Although we cannot formally rule out the contribution of mGluR5 on MSNs in the dorsal striatum or in other D1R-expressing cells, such as those in the hippocampus or cortex, a major contribution from mGluR5 in the latter structures seems unlikely since we saw no significant reduction of mGluR5 in the cortex or hippocampus of mGluR5KD-D1 mice. These observations may suggest that the D1R-promoter is less strong in these regions or that D1 and mGluR5 are not expressed in the same neuronal populations.
Recent reports have highlighted that stimulation of striatal D1R and NMDA receptors, and the resultant activation of extracellular signal-regulated kinase (ERK) specifically in D1 MSNs, represent critical mechanisms through which the long-term effects of addictive drugs are mediated (Heusner and Palmiter, 2005; Valjent et al., 2005; Bertran-Gonzalez et al., 2008). Moreover, both D1R and NMDA receptors in the accumbens appear critical for the early consolidation of appetitive Pavlovian memories (Dalley et al., 2005). These reports are particularly relevant in the context of our current findings, given the close interactions between mGluR5 and D1R (Paolillo et al., 1998; Voulalas et al., 2005; Schotanus and Chergui, 2008) and NMDA receptors (Pisani et al., 2001; Mao and Wang, 2002; Choe et al., 2006) in the striatum. Thus, it is possible that impaired incentive learning and relapse-like behaviors in mGluR5KD-D1 mice were due, in part, to changes in striatal D1R and NMDA receptor function as a consequence of mGluR5 loss. A future challenge will be to further understand the complex interplay of glutamate and dopamine signaling within striatal circuits, and to determine precisely which cellular mechanisms encode appetitive memories and mediate subsequent behavioral responses to environmental stimuli associated with natural and drug reinforcers.
In summary, our present findings, together with a recent report from our laboratory (O'Connor et al., 2010), suggest that mGluR5-mediated neuroplastic events on D1-MSNs are crucial for the formation of psychologically distinct associations between environmental stimuli and rewards that endow reward-paired stimuli with the subsequent ability to both reinforce and attract motivated behaviors. Furthermore, recent reports have revealed that mGluR5-mediated striatal plasticity is involved in, or affected by, cocaine experience (Fourgeaud et al., 2004; Moussawi et al., 2009). Our report provides a psychobiological context for these findings by pointing to glutamate signaling at mGluR5 on striatal D1-MSNs as a key mediator through which repeated cocaine experience (and presumably exposure to other drugs of abuse) produces a persistent increase in the susceptibility to relapse triggered by environmental stimuli associated with drug use.
This work was supported by the Deutsche Forschungsgemeinschaft through Collaborative Research Centers SFB 488 and SFB 636, by the Fonds der Chemischen Industrie, the European Union through Grant LSHM-CT-2005-018652 (CRESCENDO), the Bundesministerium für Bildung und Forschung through NGFNplus Grants FZK 01GS08153 and 01GS08142, the Helmholtz Gemeinschaft Deutscher Forschungszentren through Initiative CoReNe and Alliance HelMA, and the Deutsche Krebshilfe through project 108567. D.E. was supported by the Swedish Research Council, the Thuring, Wiberg and Jeansson foundations and the Swedish Society of Medicine. E.C.O. receives a studentship from the Biotechnology and Biological Sciences Research Council and Pfizer Inc. Research in D.N.S.'s laboratory is supported by the United Kingdom Medical Research Council. H.S.C. was supported by a Marie Curie reintegration award. We thank Ali Nasr Esfahani for help with immunohistochemistry, Milen Kirilov and Daniel Habermehl for input on short RNAs, and Witold Konopka for valuable suggestions.
- Correspondence should be addressed to any of the following: David Engblom, Department of Clinical and Experimental Medicine, Linköping University, 58185 Linköping, Sweden; David N. Stephens, School of Psychology, University of Sussex, Brighton BN1 9QG, UK; or Rainer Spanagel, Department of Psychopharmacology, Central Institute of Mental Health, J5, 68159 Mannheim, Germany.