Abstract
In operant learning, initial reward-associated memories are thought to be distinct from subsequent extinction-associated memories. Memories formed during operant learning are thought to be stored in “neuronal ensembles.” Thus, we hypothesize that different neuronal ensembles encode reward- and extinction-associated memories. Here, we examined prefrontal cortex neuronal ensembles involved in the recall of reward and extinction memories of food self-administration. We first trained rats to lever press for palatable food pellets for 7 d (1 h/d) and then exposed them to 0, 2, or 7 daily extinction sessions in which lever presses were not reinforced. Twenty-four hours after the last training or extinction session, we exposed the rats to either a short 15 min extinction test session or left them in their homecage (a control condition). We found maximal Fos (a neuronal activity marker) immunoreactivity in the ventral medial prefrontal cortex of rats that previously received 2 extinction sessions, suggesting that neuronal ensembles in this area encode extinction memories. We then used the Daun02 inactivation procedure to selectively disrupt ventral medial prefrontal cortex neuronal ensembles that were activated during the 15 min extinction session following 0 (no extinction) or 2 prior extinction sessions to determine the effects of inactivating the putative food reward and extinction ensembles, respectively, on subsequent nonreinforced food seeking 2 d later. Inactivation of the food reward ensembles decreased food seeking, whereas inactivation of the extinction ensembles increased food seeking. Our results indicate that distinct neuronal ensembles encoding operant reward and extinction memories intermingle within the same cortical area.
SIGNIFICANCE STATEMENT A current popular hypothesis is that neuronal ensembles in different prefrontal cortex areas control reward-associated versus extinction-associated memories: the dorsal medial prefrontal cortex (mPFC) promotes reward seeking, whereas the ventral mPFC inhibits reward seeking. In this paper, we use the Daun02 chemogenetic inactivation procedure to demonstrate that Fos-expressing neuronal ensembles mediating both food reward and extinction memories intermingle within the same ventral mPFC area.
Introduction
Operant learning involves learned associations between specific behaviors and unconditioned rewards (Skinner, 1953). Rats can learn to perform an operant response (lever press) to receive rewarding food pellets (Catania, 1992) or drug infusions (Schuster and Thompson, 1969). If food or drug reward delivery is terminated, the learned response undergoes operant extinction (Skinner, 1953; McNally, 2014; Todd et al., 2014). These reward-associated and extinction-associated memories in operant learning are thought to be distinct from each other (Bouton and Swartzentruber, 1991; Millan et al., 2011). Because memories are thought to be stored within specific patterns of sparsely distributed neurons called “neuronal ensembles” (Hebb, 1949; Pennartz et al., 1994; Guzowski et al., 2001; Cruz et al., 2013), reward-associated memories and extinction-associated memories in operant learning are also believed to be encoded by distinct neuronal ensembles (Cruz et al., 2015).
A current popular hypothesis is that neuronal activity in dorsal medial prefrontal cortex (dmPFC) promotes reward seeking whereas activity in ventral mPFC (vmPFC) inhibits reward seeking (Peters et al., 2009). Support for this hypothesis is that pharmacological inhibition of dmPFC decreases reinstatement of drug and food seeking after extinction (McFarland and Kalivas, 2001; Kalivas and McFarland, 2003; Calu et al., 2013), whereas inhibition of vmPFC reinstates drug and food seeking after extinction (Rhodes and Killcross, 2004, 2007; Ishikawa et al., 2008; Peters et al., 2008a,b). Additionally, immediate early gene expression (a neuronal activity marker) is increased in dmPFC after operant learning and in vmPFC after extinction learning (Hamlin et al., 2007; Marchant et al., 2010; Nic Dhonnchadha et al., 2012). However, the notion of dichotomous control over reward seeking by vmPFC versus dmPFC is not supported by studies showing that vmPFC inhibition decreases methamphetamine and heroin seeking after extinction or cocaine seeking after abstinence (Rogers et al., 2008; Koya et al., 2009a; Rocha and Kalivas, 2010). Additionally, dmPFC inhibition has no effect on reinstatement of heroin seeking after extinction or cocaine or methamphetamine seeking after abstinence (Koya et al., 2009a; Bossert et al., 2011; Li et al., 2015a).
From the perspective of the role of neuronal ensembles in learned behaviors, a limitation of the studies described above is that site-specific pharmacological manipulations (or lesions) interfere with neuronal activity in an entire brain region rather than selectively targeting sparsely distributed learning-related activated neurons (the neuronal ensembles) (Cruz et al., 2015). To address this issue, we and others have recently developed procedures to selectively inhibit or activate neuronal ensembles (Garner et al., 2012; Cruz et al., 2013). One of these procedures is the Daun02 inactivation procedure that uses Fos-lacZ transgenic rats that express β-galactosidase (β-Gal) under the control of the Fos promoter so that Fos and β-Gal proteins are coexpressed only in strongly activated neurons (Koya et al., 2009b). Intracranial injections of the inactive prodrug Daun02 can then inactivate recent behaviorally activated neurons because the β-Gal induced within these neurons rapidly converts Daun02 into daunorubicin, which disrupts and kills β-Gal-expressing neurons (Fanous et al., 2012; Cruz et al., 2014; Pfarr et al., 2015; Engeln et al., 2016).
Using the Daun02 procedure, we showed that selective inactivation of a minority of vmPFC neurons (∼6%) previously activated by exposure to heroin-associated contexts reduces subsequent reinstatement of drug seeking (Bossert et al., 2011). The finding that inactivating a small number of the most active neurons in the vmPFC (the neuronal ensemble) decreases reinstatement of drug seeking appears inconsistent with the findings that inhibiting the entire vmPFC reinstates drug or food seeking after extinction (Butter et al., 1963; Rhodes and Killcross, 2007; Peters et al., 2008a; LaLumiere et al., 2010). One idea that may account for these discrepant results is that distinct vmPFC neuronal ensembles (identified by Fos) encode both operant reward and operant extinction memories. Here, we used the Daun02 procedure to test this idea. We also used double-labeling in situ hybridization to determine the cell types of the Fos-expressing neurons.
Materials and Methods
Subjects.
We used a total of 189 male Long–Evans rats (Charles River) and Fos-lacZ transgenic rats (Koya et al., 2009b), each weighing between 250 and 350 g at the start of experiments. After surgery, we housed rats individually under a reverse 12 h light/dark cycle (lights off at 8:00 A.M.). Water was freely available in the rats' home cages throughout the experiment. Food was restricted to 20 g per day of Purina rat chow (given after the daily operant sessions). All procedures followed the guidelines outlined in the Guide for the care and use of laboratory animals (Ed 8; http://grants.nih.gov/grants/olaw/Guide-for-the-Care-and-Use-of-Laboratory-Animals.pdf). From all experiments, we excluded 6 of the rats for misplaced cannulas (Experiments 3 and 4, more rostral than 3.4 mm bregma or more caudal than 2.3 mm bregma).
Intracranial surgery.
We anesthetized rats with ketamine and xylazine (80 and 20 mg/kg, i.p.), and implanted permanent guide cannulas (23 gauge, Plastics One) bilaterally 1 mm above the vmPFC. The nose bar was set at −3.3 mm, and the coordinates for the vmPFC were anteroposterior 3.0, mediolateral ±1.5, and dorsoventral −4.3 (10° angle). We fixed cannulas to the rat's skull with dental cement. We administered buprenorphine (0.1 mg/kg, s.c.) after surgery to relieve pain, and allowed rats to recover for 5–7 d before operant training.
Intracranial injections.
In Experiment 2, we injected a mixture of the GABAa agonist muscimol (0.03 nmol/0.5 μl/side) and the GABAb agonist baclofen (0.3 nmol/0.5 μl/side) into vmPFC. We obtained the drugs from Tocris Bioscience and dissolved them in sterile saline; the muscimol+baclofen concentration is based on previous studies (McFarland and Kalivas, 2001; Bossert et al., 2011). In Experiment 3, we injected Daun02 or vehicle into the vmPFC. We obtained Daun02 from Sequoia Research Products (www.seqchem.com). We dissolved Daun02 (2 μg/0.5 μl/side) in vehicle solution containing 5% DMSO, 6% Tween 80, and 89% 0.1 M PBS. We chose the dose of Daun02 based on our previous studies (Koya et al., 2009b; Bossert et al., 2011; Cruz et al., 2014). We injected muscimol+baclofen, Daun02, or vehicle using a syringe pump (Harvard Apparatus) and 10 μl Hamilton syringes that were attached via polyethylene-50 tubing to 30 gauge injectors (Plastics One) that extended 1 mm beyond the guide cannula. We delivered each injection over 1 min and left the injectors in place for 1 min before removal.
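As a quick arithmetic check, the per-side doses above can be converted to solution concentrations and an infusion rate. The Python sketch below is illustrative only: the function and variable names are hypothetical and are not part of the original methods.

```python
def concentration_mM(amount_nmol: float, volume_ul: float) -> float:
    """Concentration in mM for an amount (nmol) dissolved in a volume (μl).

    1 nmol/μl equals 1 mmol/L, so the ratio is already expressed in mM.
    """
    return amount_nmol / volume_ul


def concentration_ug_per_ul(mass_ug: float, volume_ul: float) -> float:
    """Concentration in μg/μl for a mass (μg) dissolved in a volume (μl)."""
    return mass_ug / volume_ul


def flow_rate_ul_per_min(volume_ul: float, duration_min: float) -> float:
    """Average infusion rate in μl/min."""
    return volume_ul / duration_min


VOLUME_UL = 0.5  # injected volume per side, from the text

muscimol_mM = concentration_mM(0.03, VOLUME_UL)             # 0.03 nmol per side
baclofen_mM = concentration_mM(0.3, VOLUME_UL)              # 0.3 nmol per side
daun02_ug_per_ul = concentration_ug_per_ul(2.0, VOLUME_UL)  # 2 μg per side
infusion_rate = flow_rate_ul_per_min(VOLUME_UL, 1.0)        # delivered over 1 min

# Vehicle composition for Daun02 (percent by volume, from the text).
vehicle_percent = {"DMSO": 5, "Tween 80": 6, "0.1 M PBS": 89}
```

Under these conversions, the stated doses correspond to 0.06 mM muscimol, 0.6 mM baclofen, and 4 μg/μl Daun02, infused at 0.5 μl/min.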
Apparatus.
We trained and tested rats in Med Associates self-administration chambers, each equipped with two retractable levers located 7 cm above the grid floor and a red house light. Presses on the active retractable lever activated the pellet dispenser, whereas presses on the inactive retractable lever had no programmed consequences.
Operant conditioning and extinction of lever pressing for palatable food pellets.
Each experiment consisted of three phases: operant food self-administration (7 d), extinction training (0, 2, 7 d), and tests for food self-administration or extinction recall (1 d). During the self-administration phase, we trained the rats to lever press to receive palatable food pellets (45 mg, TestDiet, catalog #1811155) for 1 h/d for 7 d. Rats earned pellets on a fixed ratio 1 with 20 s timeout reinforcement schedule (formally a fixed interval 20 reinforcement schedule). During the extinction phase, responses on the previously active lever had no programmed consequences during each 1 h/d session. During the No Levers period between the operant training and extinction training phases, we placed rats in the operant conditioning chambers for 1 h/d without the retractable levers. We chose the No Levers condition to ensure that the rats in all training conditions received similar handling and experience with the training/extinction context. This condition allowed us to isolate reinforced or nonreinforced lever pressing as the manipulated variable. Based on previous studies (Lu et al., 2005; Bouton et al., 2011), we did not expect an effect of context exposure on subsequent lever pressing under extinction conditions. Finally, we conducted tests of self-administration or extinction recall under extinction conditions for 15 min.
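The fixed ratio 1 schedule with a 20 s timeout can be sketched as a simple rule over lever-press timestamps. This Python sketch is not the Med Associates control program used in the study. It assumes that presses during the timeout are counted but not reinforced and do not restart the timeout, and the function name and parameters are hypothetical.

```python
def pellet_deliveries(press_times_s, timeout_s=20.0, session_s=3600.0,
                      reinforced=True):
    """Return the times (s) at which active-lever presses earn a pellet.

    Assumed rule: the first press after the timeout has elapsed is
    reinforced (fixed ratio 1) and starts a new timeout of `timeout_s`.
    With `reinforced=False` (extinction conditions), no press is reinforced.
    """
    if not reinforced:
        return []
    deliveries = []
    next_available = 0.0  # earliest time at which a press can be reinforced
    for t in sorted(press_times_s):
        if t < 0 or t >= session_s:
            continue  # ignore presses outside the session window
        if t >= next_available:
            deliveries.append(t)
            next_available = t + timeout_s
    return deliveries


# Hypothetical press times (s) within a 1 h session.
example_presses = [0, 5, 19, 20, 21, 45]
training = pellet_deliveries(example_presses)
extinction = pellet_deliveries(example_presses, reinforced=False)
```

With these hypothetical press times, the presses at 0, 20, and 45 s are reinforced during training, and none are reinforced under extinction conditions.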
Experiment 1: effect of recall of food self-administration memories and extinction memories on Fos expression in vmPFC and dmPFC.
The goal of Experiment 1 was to determine whether exposure to cues previously associated with food self-administration training and extinction training would cause a different pattern of Fos expression in vmPFC and dmPFC. For this purpose, we exposed the rats in the test group to a short 15 min test session in which they had access to the lever previously associated with the food pellets, but lever presses were not reinforced. We reasoned that, for rats in the no prior-extinction group (0 sessions), the predominant memories that will be recalled during the 15 min test session would be the food reward memories, whereas for rats from the other 2 groups (2 or 7 extinction sessions), the predominant memories that will be recalled would be the extinction memories. We left rats in the no test group in their homecages on test day.
We assessed Fos expression in the dmPFC and vmPFC in n = 36 rats using a 3 × 2 between-subjects factorial experimental design with Prior extinction sessions (0, 2, 7) × Test (no test, test). The experimental timeline is shown in Figure 1B. We trained rats for 7 d to lever press for palatable food pellets as described above. We divided these rats into 3 groups that received 0, 2, or 7 prior extinction sessions with all groups previously matched based on lever pressing from the last day of food self-administration training. We exposed rats in the 0 d extinction group to 7 daily No Levers sessions. Rats in the 2 d extinction group underwent 5 daily No Levers sessions, followed by 2 daily 1 h extinction training sessions. Rats in the 7 d extinction group underwent 7 daily 1 h extinction training sessions. The use of the variable No Levers sessions was to ensure that each group of rats was time-matched at the time of testing and had equal exposure to the chambers. On test day, 24 h after the last extinction or No Levers session, we divided each group of rats again into 2 groups that we placed into the test chamber under extinction conditions for 15 min (test) or left in their home cages (no test).
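The time-matched schedule described above (No Levers sessions followed by extinction sessions, always 7 d in total) and the 3 × 2 design can be written out explicitly. The Python sketch below is for illustration only, with hypothetical function and variable names. It does not reproduce the matching of groups on lever pressing, for which the text does not give a procedure.

```python
from itertools import product


def post_training_schedule(extinction_sessions, total_days=7):
    """Daily session types between training and test day.

    No Levers sessions come first, so that every group ends with its
    extinction sessions and is tested the same number of days after training.
    """
    if not 0 <= extinction_sessions <= total_days:
        raise ValueError("extinction_sessions must be between 0 and total_days")
    no_levers_days = total_days - extinction_sessions
    return ["No Levers"] * no_levers_days + ["Extinction"] * extinction_sessions


# 3 x 2 between-subjects design: Prior extinction sessions x Test.
design_cells = list(product((0, 2, 7), ("no test", "test")))
rats_per_cell = 36 // len(design_cells)  # n = 36 rats split evenly across cells

schedules = {n: post_training_schedule(n) for n in (0, 2, 7)}
```

For example, `schedules[2]` lists 5 No Levers days followed by 2 Extinction days, matching the description of the 2 d extinction group, and the 36 rats divide into 6 per cell.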
Ninety minutes after the start of the test session, we deeply anesthetized rats with isoflurane for 90 s and perfused them with 100 ml of PBS, followed by 400 ml of 4% PFA in 0.1 M PBS. We postfixed the brains for an additional 90 min in PFA and incubated them in 30% sucrose in PBS at 4°C for 2–3 d. We froze brains in powdered dry ice and kept them at −80°C until sectioning. We washed coronal brain sections (40 μm) in PBS, blocked with 3% NGS in PBS with 0.25% Triton X-100 (PBS-Tx), and incubated 24 h at 4°C with anti-Fos antibody (1:4000 dilution, BA-1000, sc-52; Santa Cruz Biotechnology) in blocking solution. We then washed sections in PBS and incubated them with biotinylated goat anti-rabbit secondary antibody (1:600 dilution; Vector Laboratories) in PBS-Tx and 1% NGS for 2 h. After washing in PBS, we incubated sections for 1 h in avidin-biotin-peroxidase complex (ABC Elite kit, PK-6100; Vector Laboratories) in PBS containing 0.5% Triton X-100. Finally, we washed sections in PBS and developed in DAB for ∼3 min, washed with PBS, and mounted onto chromalum-gelatin-coated slides. Once dry, we dehydrated the slides through a graded series of alcohol (30%, 60%, 90%, 100%, 100% ethanol) and cleared them with Citrasolv (Fisher Scientific) before coverslipping with Permount (Sigma). We digitally captured bright-field images of immunoreactive (IR) cells in vmPFC and dmPFC using an EXi Aqua camera (QImaging) attached to a Zeiss Axioskop 2 microscope at 200× magnification (Carl Zeiss Microscopy) and iVision software for Macintosh, version 4.0.15 (Biovision). Observers blind to the test conditions (Pearson's r correlation = 0.91 for inter-rater reliability) automatically counted labeled nuclei from two sections (bilateral) per rat (3 images per rat). We averaged the counts so that each rat was an n of 1 for each brain area.
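The two summary steps at the end of this procedure, averaging image counts so that each rat contributes a single value and checking agreement between observers with Pearson's r, can be sketched as follows. This is a minimal Python illustration with hypothetical function names. It is not the iVision workflow used in the study, and any numbers passed to it would be placeholders.

```python
from math import sqrt
from statistics import mean


def per_rat_means(counts_by_rat):
    """Average the image counts of each rat so that each rat is an n of 1."""
    return {rat: mean(counts) for rat, counts in counts_by_rat.items()}


def pearson_r(x, y):
    """Pearson correlation between two observers' counts of the same images."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_x, mean_y = mean(x), mean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return s_xy / sqrt(s_xx * s_yy)
```

Here `per_rat_means` takes a mapping from a rat identifier to its list of image counts, and `pearson_r` takes the two observers' counts in the same image order.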
We repeated this experiment in a separate set of rats (n = 36) to determine the cellular phenotype of the Fos-expressing neurons using the same 3 × 2 between-subjects factorial experimental design with Extinction sessions (0, 2, 7) × Test (home cage, test). Thirty minutes after the beginning of the test session, we briefly anesthetized the rats with isoflurane (>30 s) and decapitated them. We rapidly extracted and froze their brains for 20 s in −40°C isopentane. We stored brains at −80°C until use. We collected coronal brain sections (16 μm) directly onto Super Frost Plus slides (Fisher Scientific) and stored the slides at −80°C until use. For RNA in situ hybridization, we used RNAscope Multiplex Fluorescent Reagent Kit (Advanced Cell Diagnostics) according to the manufacturer's instructions and as described previously (Li et al., 2015b; Rubio et al., 2015). Briefly, we fixed sections in 10% formalin at 4°C for 20 min, rinsed in PBS, and dehydrated in increasing concentrations of ethanol (50%, 70%, 100%, 100%). We stored slides in 100% ethanol overnight. The next day, we dried the slides at room temperature and drew a hydrophobic barrier around the section. We then treated slides with a protease (Pretreatment 4) for 20 min and washed in distilled water. We then applied target probes for Fos, Slc17a7 (Vglut1, a marker of pyramidal glutamatergic projection neurons), and Slc32a1 (Vgat, a marker of GABAergic interneurons) designed by Advanced Cell Diagnostics.
We then incubated slides with a series of preamplifier and amplifier probes at 40°C (AMP1 for 30 min, AMP2 for 15 min, AMP3 for 30 min). Next, we incubated slides with fluorescently labeled probes (Alexa-488, -550, and -647). Finally, we incubated the slides with DAPI and coverslipped them with Vectashield fluorescent mounting medium (Vector Laboratories). We captured fluorescent images of labeled cells in vmPFC using an EXi Aqua camera (QImaging) attached to a Zeiss AxioImager M.2 microscope at 200× magnification (Carl Zeiss Microscopy) and iVision software for Macintosh, version 4.0.15 (Biovision). We quantified mRNA colabeling from two hemi-sections using ImageJ (2 images per rat) in a blind manner. One rat was excluded from the analysis because of poor labeling for Vglut1 and Vgat.
Experiment 2: effect of pharmacological inactivation of vmPFC on recall of food self-administration or extinction memories.
Because Experiment 1 showed increased Fos expression in vmPFC after 2 extinction sessions, we used muscimol+baclofen to inactivate vmPFC before a test of self-administration or extinction recall (Koya et al., 2009b; Bossert et al., 2011) to assess a causal role for this brain area in the recall of self-administration and extinction memories.
We examined rats (n = 24) using a 2 × 2 between-subjects factorial experimental design with Prior extinction sessions (0, 2 d) × Injection (vehicle, muscimol+baclofen). The experimental timeline is shown in Figure 4A. We anesthetized the rats and implanted permanent guide cannulas bilaterally into their vmPFC as described above. After a minimum of 7 recovery days, we trained the rats using the same procedure described in Experiment 1. After training the rats to lever press for food for 7 d, we divided them into 2 groups that received 0 or 2 prior extinction sessions with both groups previously matched based on lever pressing from the last day of self-administration training. Rats then underwent either 5 d of No Levers, followed by 2 extinction sessions (extinction recall group), or 7 d of No Levers (self-administration recall group). On test day, 24 h later, we injected muscimol+baclofen or vehicle into vmPFC, as described above. We placed rats into the test chamber to habituate without access to the levers for 10 min. Following the 10 min habituation, the levers extended (but lever presses were not reinforced) for 15 min. We then deeply anesthetized the rats with isoflurane and perfused them with PBS and 4% PFA. We removed their brains to check cannula placement.
Experiment 3: effect of Daun02 inactivation of neurons in vmPFC activated during recall of food self-administration or extinction memories.
Because Experiment 1 showed maximal Fos expression in vmPFC after 2 extinction sessions, we used the Daun02 inactivation procedure (Koya et al., 2009b; Bossert et al., 2011; Cruz et al., 2014) to determine whether distinct Fos-expressing neuronal ensembles in vmPFC play a causal role in the recall of extinction memories. The experimental procedure was similar to that of Experiment 1, with the exception that we only compared groups of rats that previously experienced 0 or 2 extinction sessions and added a short 15 min “induction” session, identical to the subsequent test session, to recall the putative food reward memories and the extinction memories. We hypothesized that extinction memories are recalled during induction day in rats that previously experienced 2 d of extinction training. Therefore, we predicted that postsession injections of Daun02 would inactivate ensembles encoding extinction memories and impair extinction recall in the subsequent test session 2 d later, resulting in increased food seeking. Conversely, to the degree that Daun02 inactivation on induction day would inactivate food reward memories in the 0 extinction sessions group, this would lead to decreased food seeking on the subsequent test for food seeking 2 d later.
We examined Fos-lacZ rats (n = 46) using a 2 × 2 between-subjects factorial experimental design with Prior extinction sessions (0, 2 d) × Injection (vehicle, Daun02). The experimental timeline is shown in Figure 5A. We anesthetized Fos-lacZ rats (previously bred for 45–50 generations on a Sprague Dawley background) and implanted permanent guide cannulas bilaterally into their vmPFC as described above. After a minimum of 7 recovery days, we trained rats using the same procedure described in Experiment 1. After training the rats to lever press for food for 7 d, we divided them into 2 groups that received 0 or 2 prior extinction sessions with both groups previously matched based on lever pressing from the last day of self-administration training. Rats then underwent either 5 d of No Levers, followed by 2 extinction sessions (extinction recall group), or 7 d of No Levers (self-administration recall group). On induction day, 24 h later, we placed the rats into the test chamber under extinction conditions for 15 min to induce β-Gal protein expression. We then returned the rats to their home cages, and 90 min after the start of the induction session, injected vehicle or Daun02 into the vmPFC as described above. Rats underwent two additional days of 1 h/d No Levers sessions following induction day. We chose the No Levers condition to remain consistent with the 2 d before testing in our previous Fos expression experiments. On test day, 24 h later, we gave rats in the self-administration and extinction recall groups a brief 15 min test under extinction conditions and returned them to their home cages. After 75 min, we deeply anesthetized the rats with isoflurane and perfused them with PBS and 4% PFA. We removed their brains and processed them for Fos immunohistochemistry as described above.
Experiment 4: effect of Daun02 inactivation of neurons in vmPFC selectively activated by recall of food self-administration memories: a replication.
The goal of Experiment 4 was to confirm that our induction procedure for the rats previously exposed to 0 extinction sessions indeed activates the memories of food reward. At issue is that in Experiment 3 (and Experiment 1) the memory recall for the food was performed during a short 15 min extinction session. Thus, it is possible that neuronal activation also reflects initial extinction learning. To address this potential confound, in Experiment 4 we directly compared the effect of Daun02 injections on induction day in 2 groups of rats in which lever presses did or did not result in food delivery. To the degree that under these conditions the food reward memory is recalled, we would expect a similar behavioral effect (inhibition) during a test for food seeking 2 d later.
We assessed whether the same Fos-expressing neuronal ensembles are activated when rats are lever pressing for food reward and when rats are lever pressing under extinction conditions. We examined Fos-lacZ (n = 47) rats using a 2 × 2 between-subjects factorial experimental design with Induction day (food, no food) × Injection (vehicle, Daun02). We anesthetized Fos-lacZ rats and implanted permanent guide cannulas into vmPFC as described above. The experimental timeline is shown in Figure 6A. After training the rats to lever press for food for 7 d, we divided them into 2 groups (food, no food) that we matched based on lever pressing from the last day of self-administration training. For the Food group, 24 h after the last training session, we exposed the rats to a brief 15 min training session where rats earned food pellets. After 75 additional minutes, we injected Daun02 or vehicle into vmPFC. For the No food groups, 24 h after the last training session, we exposed the rats to 7 d of No Levers sessions before a brief 15 min induction session. After 75 additional minutes, we injected Daun02 or vehicle into the vmPFC and then placed the rats in their home cages. For all groups, rats underwent two additional days of 1 h/d No Levers sessions following induction day. On test day, 24 h later, we gave all groups a brief 15 min test, under extinction conditions. After 75 additional minutes, we deeply anesthetized the rats with isoflurane and perfused them with PBS and 4% PFA. We removed the brains and processed them for Fos immunohistochemistry as described above.
Statistical analysis.
We analyzed the behavioral and immunohistochemical data by one-way and two-way ANOVAs using Prism (Graphpad Software). Fisher's PLSD was used for post hoc analyses when prior ANOVAs indicated significant main or interaction effects (p < 0.05).
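For readers who want to see how the F ratios and degrees of freedom in a two-way between-subjects ANOVA arise, the following Python sketch computes them for a balanced, fully crossed design. It is not the Prism analysis used in the study. It assumes equal group sizes, the function name is hypothetical, and it returns F ratios without p values (these could be obtained from an F distribution, for example with `scipy.stats.f.sf`).

```python
from statistics import mean


def two_way_anova(cells):
    """F ratios for a balanced two-way between-subjects ANOVA.

    `cells` maps (level_a, level_b) to a list of observations; every
    combination of levels must be present with the same number of values.
    Returns {effect: (df_effect, df_error, F)} for "A", "B", and "AxB".
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if len(cells) != len(levels_a) * len(levels_b) or n < 2 or any(
            len(values) != n for values in cells.values()):
        raise ValueError("need a balanced, fully crossed design with n >= 2")

    grand = mean(x for values in cells.values() for x in values)
    mean_a = {a: mean(x for b in levels_b for x in cells[(a, b)])
              for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in cells[(a, b)])
              for b in levels_b}

    # Sums of squares for the main effects, the interaction, and the error.
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_error = len(cells) * (n - 1)
    ms_error = ss_error / df_error
    return {
        "A": (df_a, df_error, (ss_a / df_a) / ms_error),
        "B": (df_b, df_error, (ss_b / df_b) / ms_error),
        "AxB": (df_a * df_b, df_error, (ss_ab / (df_a * df_b)) / ms_error),
    }
```

For a 3 × 2 design with 6 rats per group, this gives 2 and 30 degrees of freedom for the three-level factor and for the interaction, and 1 and 30 for the two-level factor.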
Results
Experiment 1: effect of recall of food self-administration memories and extinction memories on Fos expression in vmPFC and dmPFC
We hypothesized that the food self-administration memory or food reward memory was recalled following 0 prior extinction sessions, whereas the “extinction” memory was recalled following 2 and 7 prior extinction sessions. This hypothetical framework is illustrated in Figure 1A at different points in our experimental procedure shown in Figure 1B. We predicted that this differential memory recall would lead to differential Fos expression patterns in vmPFC and dmPFC.
Figure 1. Food self-administration and extinction of food seeking, and extinction-induced Fos expression in vmPFC. A, Hypothetical formation of self-administration and extinction memories and the predicted behavioral responses. B, Experimental timeline for extinction-induced Fos expression. We trained rats to self-administer food for 7 d and then divided them into three groups with varying numbers of extinction sessions: 0, 2, or 7 sessions. “No levers” indicates the days when we placed rats in the self-administration boxes without access to levers. We assessed food seeking under extinction conditions on test day, 24 h after the last Extinction or No lever session. C, Data are presented as mean ± SEM number of active lever presses, pellets earned, and inactive lever presses over 7 d of self-administration and extinction training. D, Mean ± SEM number of lever presses on both active and inactive lever presses over a 15 min test. *p < 0.05, different from 0 Extinction session group. (n = 6 per group).
Figure 1C shows the mean ± SEM number of pellets earned and active and inactive lever presses during the self-administration and extinction phases. Rats rapidly learned to lever press for pellets during the self-administration phase (p < 0.01) and rapidly decreased lever pressing when pellet rewards were removed during the extinction phase (p < 0.01). We found no significant differences in active lever presses on the first and second days of extinction between rats receiving 2 or 7 d of extinction sessions (p > 0.05). These were the “prior extinction sessions” before test day.
On test day, lever pressing was assessed for 15 min under extinction conditions (Fig. 1D). Two-way ANOVA indicated significant main effects of Prior extinction sessions (F(2,30) = 5.5, p < 0.01) and Lever (active, inactive) (F(1,30) = 12.6, p < 0.01), but no significant interaction between the two factors (p > 0.1). Post hoc analysis showed that rats that received 2 or 7 extinction sessions pressed the active lever less than rats that received 0 extinction sessions (p < 0.05).
We then analyzed Fos expression in vmPFC and dmPFC following the extinction sessions; sample images from vmPFC are shown in Figure 2A. For vmPFC, two-way ANOVA indicated significant main effects of Prior extinction sessions (F(2,30) = 6.3, p < 0.01) and Test day exposure (no test, test) (F(1,30) = 56.7, p < 0.01), and a significant interaction between the two factors (F(2,30) = 8.1, p < 0.01). Post hoc analysis showed that 2, but not 7, prior extinction sessions produced higher levels of Fos expression than 0 prior extinction sessions (p < 0.05; Fig. 2B). For dmPFC, two-way ANOVA indicated a significant main effect of Test day exposure (F(1,30) = 25.2, p < 0.01), but no significant effect of Prior extinction sessions and no interaction between the two factors (p > 0.05). Total numbers of Fos-IR nuclei in the vmPFC and dmPFC are presented in Table 1.
Figure 2. Fos expression in vmPFC after testing. A, Representative images of Fos-IR in vmPFC captured at 200× magnification. B, Mean ± SEM number of Fos-IR nuclei per mm² in vmPFC. C, Representative images of merged Fos+NeuN double-labeled nuclei in yellow. Red-labeled nuclei indicate expression of the general neuronal marker NeuN. Green-labeled nuclei indicate Fos expression. Scale bar, 100 μm. D, Mean ± SEM percentage of Fos+NeuN double-labeled nuclei in vmPFC. *p < 0.05, different from 0 and 7 Extinction session groups (n = 5 or 6 per group).
Table 1. Fos immunostaining in mPFC subregions after testing in Experiment 1
To determine the proportion of Fos-expressing neurons in the vmPFC, we double-labeled additional sections from the same rats for Fos and the general neuronal marker protein NeuN; sample images from vmPFC are shown in Figure 2C. Approximately 4% of NeuN-labeled neurons in the 2 prior extinction sessions group colabeled with Fos (Fig. 2D). Two-way ANOVA indicated significant main effects of Test day exposure (F(1,30) = 41.2, p < 0.01), Prior extinction sessions (F(2,30) = 5.3, p < 0.05), and a significant interaction between the two factors (F(2,30) = 7.7, p < 0.01). Consistent with the previous total Fos expression data, post hoc analysis indicated that 2 prior extinction sessions significantly increased the percentage of NeuN-labeled cells that colabeled for Fos (p < 0.05). Total numbers of Fos- and NeuN-labeled cells are presented in Table 2.
Table 2. Numbers of vmPFC Fos- and NeuN-positive cells after testing in Experiment 1
To determine the phenotype of Fos-expressing neurons in the vmPFC, we used in situ hybridization to label Fos, Slc17a7 (Vglut1), and Slc32a1 (Vgat) mRNA in a separate set of rats trained identically to the first experiment. Vglut1 is a marker of glutamatergic pyramidal neurons in PFC (Bellocchio et al., 2000; Geisler et al., 2007), and Vgat is a marker of GABAergic interneurons (McIntire et al., 1997; Henny and Jones, 2008). Figure 3A shows the mean ± SEM number of pellets earned and active and inactive lever presses during the self-administration and extinction phases for rats used for in situ analysis.
Figure 3. Characterization of Fos-expressing cells in vmPFC after testing. A, Data are presented as mean ± SEM number of active lever presses, pellets earned, and inactive lever presses over 7 d of self-administration and extinction training. B, Mean ± SEM number of lever presses on both active and inactive lever presses during the 15 min test session. C, Mean ± SEM number of Fos-labeled nuclei per mm² in vmPFC. D, Mean ± SEM percentage of Fos+Vglut1-labeled cells per mm² in vmPFC. E, Mean ± SEM percentage of Fos+Vgat-labeled cells per mm² in vmPFC. F, Representative images of merged in situ hybridization for Fos and DAPI, Fos and Slc17a7 (Vglut1), and Fos and Slc32a1 (Vgat). Arrows indicate double-labeled cells. Scale bar, 100 μm. *p < 0.05, different from 0 Extinction session group (n = 5 or 6 per group).
Rats rapidly learned to lever press for pellets during the self-administration phase (p < 0.01) and rapidly decreased their lever pressing when pellet rewards were removed during the extinction phase (p < 0.01). We found no significant differences in active lever presses on the first and second days of extinction between rats receiving 2 or 7 d of extinction sessions (p > 0.05). On test day, lever pressing was assessed for 15 min under extinction conditions (Fig. 3B). Two-way ANOVA indicated significant main effects of Prior extinction sessions (F(2,30) = 68.6, p < 0.01) and Lever (active, inactive) (F(1,30) = 40.2, p < 0.01), and a significant interaction between the two factors (F(2,30) = 31.6, p < 0.01). Post hoc analysis showed that rats that received 2 or 7 extinction sessions pressed the active lever less than rats that received 0 extinction sessions (p < 0.05).
Figure 3C shows that Fos mRNA expression patterns mirrored those of protein expression. Two-way ANOVA indicated significant main effects of Prior extinction sessions (F(2,29) = 5.2, p < 0.05) and Test day exposure (F(1,29) = 77.0, p < 0.01), and a significant interaction between the two factors (F(2,29) = 3.6, p < 0.05). Post hoc analysis indicated that 2, but not 7, prior extinction sessions produced significantly higher levels of Fos mRNA expression than 0 prior extinction sessions (p < 0.05). Approximately 90% of Fos-expressing neurons colabeled with Vglut1 (Fig. 3D), and <10% colabeled with Vgat (Fig. 3E). Sample images from vmPFC are shown in Figure 3F. We found no evidence for cell-type specificity of Fos activity following self-administration or extinction recall.
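The colabeling percentages reported above are proportions of Fos-expressing cells that also carry a second marker. As a minimal sketch of that arithmetic, the following uses hypothetical cell counts (the function name and all numbers are illustrative assumptions, not data or code from this study):

```python
def colabel_percentage(double_labeled: int, fos_total: int) -> float:
    """Percentage of Fos-expressing cells that also carry a second marker."""
    if fos_total <= 0:
        raise ValueError("fos_total must be positive")
    if not 0 <= double_labeled <= fos_total:
        raise ValueError("double_labeled must lie between 0 and fos_total")
    return 100.0 * double_labeled / fos_total


# Hypothetical counts for one vmPFC section (illustrative only).
fos_total = 200    # all Fos-labeled cells counted
fos_vglut1 = 181   # Fos-labeled cells that also express Vglut1
fos_vgat = 17      # Fos-labeled cells that also express Vgat

pct_vglut1 = colabel_percentage(fos_vglut1, fos_total)
pct_vgat = colabel_percentage(fos_vgat, fos_total)

print(f"Fos+Vglut1: {pct_vglut1:.1f}% of Fos-expressing cells")
print(f"Fos+Vgat:   {pct_vgat:.1f}% of Fos-expressing cells")
```

With these assumed counts, the glutamatergic fraction is close to 90% and the GABAergic fraction is below 10%, the same qualitative split described in the text.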
Together, the results of Experiment 1 suggest that the recall of food memories and extinction memories following different durations of extinction training (2 or 7 sessions) leads to selective activation of vmPFC but not dmPFC after short (2 sessions) but not prolonged (7 sessions) extinction training.
Experiment 2: effect of pharmacological inactivation of vmPFC
Based on the results of Experiment 1, we tested whether globally inactivating vmPFC would influence food seeking. We used infusions of the GABA-A and GABA-B agonists muscimol and baclofen to inactivate all neurons within the vmPFC (Koya et al., 2009b; Bossert et al., 2011) to determine whether this brain area plays a causal role in the recall of self-administration and extinction memories.
Figure 4C shows the mean ± SEM number of pellets earned and active and inactive lever presses during the self-administration and extinction training phases. Rats rapidly learned to lever press for pellets during the self-administration phase and rapidly decreased lever pressing when pellet rewards were removed during the extinction phase (p values <0.01). Figure 4D indicates no differences between groups for lever pressing on the last day of self-administration before testing with muscimol+baclofen.
Pharmacological inactivation of vmPFC following 0 and 2 extinction sessions. A, Experimental timeline for the inactivation experiment. We trained rats to self-administer food for 7 d and then divided them into 2 groups that underwent 0 or 2 extinction sessions. On test day, we bilaterally injected vehicle or the GABA-A and GABA-B agonists muscimol+baclofen into the vmPFC 10 min before a 15 min test session under extinction conditions. B, Active lever presses, pellets earned, and inactive lever presses over 7 d of food self-administration and extinction training. Data are mean ± SEM. C, Figure showing placement of cannulas. D, Mean ± SEM number of active lever presses on the last day of self-administration indicates no differences before testing. E, Mean ± SEM number of active lever presses over a 15 min test following vehicle or muscimol+baclofen injections. (n = 6 per group).
On test day, lever pressing was assessed for 15 min under extinction conditions (Fig. 4E). Two-way ANOVA indicated a significant main effect of Prior extinction sessions (F(1,20) = 19.0, p < 0.01) but no significant effect of muscimol+baclofen and no interaction between the two factors (p values >0.05). Post hoc analysis showed that rats that received 2 prior extinction sessions pressed the active lever less than rats that received 0 extinction sessions (p < 0.05).
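The degrees of freedom in the F statistic above follow from the design. The sketch below assumes a fully between-subjects two-way ANOVA with no missing observations; the helper name `two_way_anova_df` is hypothetical and is not taken from the authors' analysis.

```python
def two_way_anova_df(levels_a: int, levels_b: int, n_total: int) -> dict:
    """Degrees of freedom for a fully between-subjects two-way ANOVA."""
    cells = levels_a * levels_b
    if n_total <= cells:
        raise ValueError("need more observations than design cells")
    return {
        "factor_a": levels_a - 1,
        "factor_b": levels_b - 1,
        "interaction": (levels_a - 1) * (levels_b - 1),
        "error": n_total - cells,
    }


# 2 (Prior extinction sessions: 0 or 2) x 2 (vehicle or muscimol+baclofen),
# with n = 6 rats per group as stated in the figure legend.
df = two_way_anova_df(levels_a=2, levels_b=2, n_total=4 * 6)
print(df)
```

Under these assumptions, each effect has 1 numerator degree of freedom and the error term has 20, which agrees with the reported F(1,20).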
Experiment 3: effect of Daun02 inactivation of neurons in vmPFC activated during recall of food self-administration or extinction memories
We hypothesized that selective inactivation of only the neuronal ensembles that were activated during recall of self-administration and extinction may produce a different effect than global inactivation of all vmPFC neurons. Based on the results of Experiment 1 (maximal Fos expression in vmPFC after 2 extinction sessions), we used the Daun02 inactivation procedure (Koya et al., 2009b; Bossert et al., 2011; Cruz et al., 2014) to determine whether distinct Fos-expressing neuronal ensembles in vmPFC play a causal role in the recall of food reward memories and extinction memories. For this purpose, we added a short 15 min “induction” session, identical to the subsequent test session 3 d later, to first recall the putative food reward memories or extinction memories and then inactivate these memories using postsession Daun02 injections. We tested whether inactivating extinction and food reward memories during induction day would lead to increases and decreases in food seeking during the test day, respectively.
The design of Experiment 3 is shown in Figure 5A. Figure 5B shows cannula placements for these rats. Figure 5C shows the mean ± SEM number of pellets earned and active and inactive lever presses during the self-administration and extinction training phases. Rats rapidly learned to lever press for pellets during the self-administration phase and rapidly decreased lever pressing when pellet rewards were removed during the extinction phase (p values <0.01).
Daun02 inactivation of activated vmPFC neurons following 0 and 2 extinction sessions. A, Experimental timeline for the Daun02 inactivation experiment. We trained rats to self-administer food for 7 d and then divided them into 2 groups that underwent 0 or 2 extinction sessions. On induction day, we gave the rats a 15 min extinction session and bilaterally injected Daun02 or vehicle 90 min after the beginning of the session. We assessed food seeking under extinction conditions 3 d later. B, Data are mean ± SEM number of active lever presses, pellets earned, and inactive lever presses over 7 d of food self-administration and extinction training (n = 11 or 12 per group). C, Figure showing placement of cannulas. D, Mean ± SEM number of active lever presses over a 15 min extinction session on induction day. E, Mean ± SEM number of active lever presses over a 15 min extinction session on test day. F, Mean ± SEM number of β-Gal-positive nuclei per mm2 in the vmPFC 90 min after the test. *p < 0.05, different from vehicle (n = 11 or 12 per group).
On induction day, lever pressing was assessed for 15 min under extinction conditions; rats were then placed in their home cages for 75 min before being injected with Daun02 or vehicle into the vmPFC (Fig. 5D). We hypothesized that the food self-administration memory or food reward memory was recalled following 0 extinction sessions, whereas the “extinction” memory was recalled following 2 extinction sessions. Rats that have not experienced extinction have the expectation of rewarded lever pressing, whereas rats that have experienced extinction are likely to recall the extinction memory during testing. Two-way ANOVA indicated a significant main effect of Prior extinction sessions (F(1,39) = 38.0, p < 0.01) but no difference between the injection groups that received Daun02 or vehicle, and no significant interaction between the two main factors (p > 0.05). Post hoc analysis showed that the rats that received 2 prior extinction sessions pressed the active lever less than rats that received 0 extinction sessions (p < 0.05).
On test day, we assessed the behavioral effects of prior Daun02 inactivation of the putative food self-administration-related and extinction-related neuronal ensembles (Fig. 5E). Two-way ANOVA indicated that active lever presses varied as a function of Prior extinction sessions (F(1,39) = 11.0, p < 0.01), but not Daun02 injection (p > 0.05); more importantly, there was a significant interaction between the Daun02 condition and Prior extinction sessions (F(1,39) = 6.8, p < 0.01). Post hoc analysis showed that Daun02 injections reduced active lever presses in rats with 0 prior extinction sessions but increased lever presses in rats that underwent 2 prior extinction sessions, compared with vehicle-treated controls (Fig. 5E, p < 0.05). Following testing, we perfused rats and assessed β-Gal expression in the vmPFC (Fig. 5F). Two-way ANOVA indicated that β-Gal expression did not vary as a function of Prior extinction sessions (p > 0.05) or Daun02 injection (p > 0.05), and there was no significant interaction between the two factors (p > 0.05).
Together, to the degree that Fos can serve as a marker of ensemble activation (Cruz et al., 2013), the results of Experiment 3, bidirectional modulation of food seeking during testing by Daun02 injections, suggest that distinct neuronal ensembles within the vmPFC encode food reward memories and extinction memories.
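A bidirectional effect of this kind appears in a two-way ANOVA as a significant interaction with little or no main effect of the injection factor, because opposite changes in the two extinction conditions cancel when averaged. The sketch below illustrates this for a balanced, fully between-subjects 2 × 2 design. The lever-press values are hypothetical and chosen only to produce a crossover pattern, the function name `two_way_anova` is an assumption, and p values are omitted because the Python standard library has no F distribution.

```python
from statistics import mean


def two_way_anova(cells):
    """F ratios for a balanced, fully crossed between-subjects two-way ANOVA.

    `cells` maps (level_a, level_b) to a list of observations.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if len(cells) != len(levels_a) * len(levels_b):
        raise ValueError("design must be fully crossed")
    if n < 2 or any(len(v) != n for v in cells.values()):
        raise ValueError("design must be balanced with n >= 2 per cell")

    grand = mean([x for v in cells.values() for x in v])
    mean_a = {a: mean([x for b in levels_b for x in cells[(a, b)]])
              for a in levels_a}
    mean_b = {b: mean([x for a in levels_a for x in cells[(a, b)]])
              for b in levels_b}

    # Sums of squares for the two main effects, the interaction, and error.
    ss_a = n * len(levels_b) * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = n * len(levels_a) * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss_ab = ss_cells - ss_a - ss_b
    ss_err = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df_a, df_b = len(levels_a) - 1, len(levels_b) - 1
    df_ab, df_err = df_a * df_b, len(cells) * (n - 1)
    ms_err = ss_err / df_err  # assumes nonzero within-cell variance
    return {
        "F_a": (ss_a / df_a) / ms_err,
        "F_b": (ss_b / df_b) / ms_err,
        "F_ab": (ss_ab / df_ab) / ms_err,
        "df": (df_a, df_b, df_ab, df_err),
    }


# Hypothetical active lever presses (illustrative only, not study data).
# Keys: (prior extinction sessions, injection).
presses = {
    (0, "vehicle"): [30, 34, 32, 36],
    (0, "Daun02"): [20, 24, 22, 26],   # lower than vehicle
    (2, "vehicle"): [10, 14, 12, 16],
    (2, "Daun02"): [20, 24, 22, 26],   # higher than vehicle
}
result = two_way_anova(presses)
print(f"Prior extinction sessions: F = {result['F_a']:.1f}")
print(f"Injection:                 F = {result['F_b']:.1f}")
print(f"Interaction:               F = {result['F_ab']:.1f}")
```

In this constructed example, the injection main effect is zero while the interaction term is large, which is the statistical signature of opposite Daun02 effects in the two extinction conditions.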
Experiment 4: effect of Daun02 inactivation of neurons in vmPFC selectively activated by recall of food self-administration memories: a replication
The goal of Experiment 4 was to confirm that our induction procedure for the rats previously exposed to 0 extinction sessions indeed activates the memories of food reward by directly comparing the effect of Daun02 injections on induction day in groups of rats in which lever presses did or did not result in food delivery. We predicted a similar behavioral effect of Daun02 injections in the two conditions (inhibition) during a test for food seeking 2 d later.
The design of Experiment 4 is shown in Figure 6A. Figure 6B shows cannula placements for these rats. Figure 6C shows the mean ± SEM number of pellets earned and active and inactive lever presses during the self-administration phase. Rats rapidly learned to lever press for pellets during the self-administration phase (p < 0.01).
Daun02 inactivation of activated vmPFC neurons during food self-administration training and 0 d extinction conditions. A, Experimental timeline for the Daun02 inactivation experiment. We divided rats into 2 groups: the No Food group was trained to self-administer food for 7 d followed by an additional 7 d of exposure to the self-administration box without levers; the Food group was trained to self-administer for 7 d only. On induction day, we gave the No Food rats a 15 min extinction session while the Food rats underwent 15 min of food self-administration, and then we injected all rats bilaterally with Daun02 or vehicle 75 min later. We assessed food seeking under extinction conditions 3 d later on test day. B, Data are mean ± SEM number of active lever presses, pellets earned, and inactive lever presses over 7 d of food self-administration and extinction training. C, Figure showing placement of cannulas. D, Mean ± SEM number of active lever presses over a 15 min extinction session on induction day. E, Mean ± SEM number of active lever presses over a 15 min extinction session on test day. F, Mean ± SEM number of β-Gal-positive nuclei per mm2 in the vmPFC 90 min after the test. *p < 0.05, different from vehicle (n = 11 or 12 per group).
On induction day, lever pressing was assessed for 15 min under either regular food self-administration training conditions or extinction conditions; rats were then placed in their home cages for 75 min before being injected with Daun02 or vehicle into the vmPFC (Fig. 6D). We expected that the food self-administration memory would be recalled following both the brief food self-administration and extinction sessions with and without food, respectively. Two-way ANOVA indicated a significant main effect of presence of food reward during induction day (F(1,40) = 31.1, p < 0.01), but no difference between the injection groups that subsequently received Daun02 or vehicle, and no significant interaction between the two factors (p > 0.05). Post hoc analysis showed that rats that received food reward on induction day pressed the active lever significantly more than rats that did not receive the food reward on induction day (p < 0.05).
On test day, we assessed the behavioral effects of prior Daun02 inactivation of the putative food self-administration-related neuronal ensembles. Two-way ANOVA indicated that active lever presses varied as a function of the presence of food reward during induction day (F(1,40) = 9.8, p < 0.01) and Daun02 injection (F(1,40) = 6.2, p < 0.05); the interaction between the two factors was not significant (p > 0.05). Post hoc analysis indicated that Daun02 injections reduced active lever presses in rats in both the Food and No Food groups (p < 0.05) (Fig. 6E). Following testing, we perfused the rats and assessed β-Gal expression in the vmPFC (Fig. 6F). Two-way ANOVA indicated that β-Gal expression did not vary as a function of the presence of food reward during induction day (p > 0.05) or Daun02 injection (p > 0.05), and there was no significant interaction between the two factors (p > 0.05).
Together, the results of Experiment 4 replicate those of Experiment 3 and confirm the notion that the neuronal activation in vmPFC during the induction session in the 0 d extinction group is likely due to the recall of the food reward memories.
Discussion
We assessed the role of vmPFC neuronal ensembles in encoding food self-administration reward memories and extinction memories in an operant learning task. Rats learned to lever press for palatable food pellets during self-administration training and rapidly reduced lever pressing following 2 or 7 d of extinction training. In the absence of food reward on test day, we found that vmPFC Fos expression was higher following 2 prior extinction sessions than after 0 or 7 prior extinction sessions. We hypothesized that neuronal ensembles encoding the food reward memories were activated during the initial 15 min on test day following 0 prior extinction sessions, whereas neuronal ensembles encoding the extinction memories were activated on test day following 2 or 7 prior extinction sessions.
We tested this hypothesis using both region-wide inactivation with muscimol+baclofen and the Daun02 inactivation procedure that inactivates only Fos-expressing neurons that were previously activated 90 min before Daun02 injections. Daun02 inactivation of Fos-expressing neurons that were previously activated by the recall of food reward (0 prior extinction sessions) on induction day decreased nonreinforced food seeking on test day 3 d later. These data suggest that the activated neurons play a causal role in the recall of the food self-administration reward. Conversely, Daun02 inactivation of Fos-expressing neurons that were previously activated by extinction recall on induction day (2 prior extinction sessions) increased subsequent lever pressing in the absence of food reward on test day. These data suggest that the activated neurons encode the extinction memories. Overall, our data suggest that two different neuronal ensembles within the vmPFC encode distinct food reward memories and extinction memories.
In rats with 0 prior extinction sessions, we assumed that the first 15 min of lever pressing without food reward on test day (Experiment 1) or induction day (Experiment 3) activated the same food self-administration memory and neuronal ensembles as during the previous food self-administration training. In Experiment 4, we directly tested this assumption using Daun02 to inactivate neuronal ensembles that were previously activated during lever pressing for food reward in one group of rats or no food reward in another group of rats, and then assessed the effects of Daun02 inactivation on subsequent nonreinforced food seeking 3 d later on test day. We found that Daun02 inactivation on induction day decreased food seeking on test day in both groups. These data, which replicated the 0 extinction training group in Experiment 3, suggest that, without any prior extinction training, our rats recalled the food-reward memories.
It is important to note that, depending on the experimental conditions, Daun02 inactivation either decreased or increased lever pressing during testing, ruling out nonspecific behavioral effects of the drug injections. This pattern of results confirms that Daun02 inactivation acts on specific neuronal ensembles within a given brain area (Bossert et al., 2011; Fanous et al., 2012; Pfarr et al., 2015). We did not use a novel context control in our current study, as we had used in some of our previous studies (Fanous et al., 2012; Cruz et al., 2014). The novel context was previously used to activate a “random” ensemble of similar or larger size than the test ensemble, and then inactivate it with Daun02 to demonstrate that the effects of Daun02 inactivation are not a matter of how many neurons are inactivated, but rather which neurons are inactivated. In the current study, we provide even stronger evidence for distinct ensembles than that indicated by a novel context control, by showing that Daun02 inactivation has opposing effects on behavior based on the experimental conditions that presumably activated different ensembles on induction day. Furthermore, region-wide inactivation of vmPFC with muscimol+baclofen had no statistically significant effect on food seeking, further ruling out the possibility that the behavioral effects of Daun02 in our study are due to nonspecific effects on behavior. However, the data from the muscimol+baclofen experiment in reference to the effect of global inactivation of vmPFC on food seeking should be interpreted with caution because the n in this study (6 per group) was lower than in the Daun02 experiments. Based on the trend of the group means in the self-administration recall condition (0 extinction days), the inclusion of more rats may result in a significant effect of muscimol+baclofen. Nevertheless, there was no trend for a muscimol+baclofen effect in the extinction recall groups (2 extinction days), suggesting qualitatively different effects of global inactivation versus selective inactivation with Daun02.
Surprisingly, we found no difference between rats receiving Daun02 and those receiving vehicle when using XGal to indicate remaining β-Gal enzyme activity. This may be due to damage or inflammation from the cannulation procedure. Alternatively, the XGal-labeled neurons activated on test day may reflect rewiring of the remaining cue-activated afferents to a new set of neurons in the vmPFC. However, we do not have data to support either possibility without additional experiments.
Finally, we found that Fos expression in vmPFC was higher in the extinction recall condition in rats with a history of short (2 sessions) extinction than in those with a history of prolonged (7 sessions) extinction (Figs. 2, 3). The reason for this difference, which did not reach formal statistical significance, is unknown but may reflect a gradual experience-dependent refinement of the ensemble, as it becomes more extinction-specific when the rats habituate to stimuli irrelevant to extinction learning during the course of extinction training.
Role of mPFC neuronal ensembles in reward self-administration and extinction of operant conditioning
The mPFC plays a critical role in learning and memory for both drug and natural rewards (Kalivas and Volkow, 2005; Phillips et al., 2008; Floresco, 2013). As mentioned in the Introduction, a popular hypothesis is that the dmPFC promotes reward seeking whereas vmPFC inhibits reward seeking (Peters et al., 2008b, 2009; Peters and De Vries, 2013). Several lines of evidence support this hypothesis. Fos immunoreactivity is elevated in dmPFC on the last day of operant conditioning for cocaine (Zavala et al., 2007), the first day of extinction of cocaine seeking (Nic Dhonnchadha et al., 2012), and after reinstatement of food and drug seeking (Neisewander et al., 2000; Cifani et al., 2012). Furthermore, pharmacological or optogenetic inhibition of the dmPFC decreases food and drug seeking after extinction (McFarland and Kalivas, 2001; Capriles et al., 2003; Nair et al., 2011; Calu et al., 2013; Stefanik et al., 2013). Conversely, Fos expression is increased in vmPFC during extinction of alcoholic beer seeking (Marchant et al., 2010) and cocaine seeking (Nic Dhonnchadha et al., 2012). Furthermore, lesions or inactivation of vmPFC induces reinstatement and potentiates spontaneous recovery of drug or food rewards (Rhodes and Killcross, 2004, 2007; Peters et al., 2009). However, results from other studies do not support that hypothesis of opposite roles of dmPFC versus vmPFC in promoting versus inhibiting reward seeking (Jonkman et al., 2009; Koya et al., 2009a; Bossert et al., 2011). Here, we find that pharmacological inactivation of the vmPFC had no effect on food seeking after extinction, further muddling this hypothesis.
One potential explanation for this discrepancy is that the above studies assessed the behavioral effects of site-specific pharmacological manipulations or lesions that interfere with neuronal activity in an entire brain region rather than selectively targeting the sparsely distributed, learning-related activated neurons (the neuronal ensembles) (Cruz et al., 2015). Therefore, these manipulations lack the specificity needed to inactivate the particular neuronal ensembles that encode distinct learned behaviors.
Here, we reconcile some of these discrepancies by showing that separate neuronal ensembles within vmPFC can mediate both food reward memories and extinction memories, and consequently both promote and inhibit reward seeking. Furthermore, consistent with our previous findings (Bossert et al., 2011; Fanous et al., 2012; Cruz et al., 2013, 2014, 2015), we found that only a small percentage (∼4%-6%) of neurons within a given brain area play a role in operant reward seeking.
We used the Daun02 inactivation procedure to selectively inactivate neuronal ensembles within the vmPFC that were previously activated by the recall of either food reward memories or extinction memories. Inactivation of neurons activated by the recall of food reward memories decreased lever pressing in a subsequent test for food seeking. This finding is consistent with previous findings from our laboratory showing that inactivation of drug context-paired neuronal ensembles in the vmPFC disrupted reinstatement of drug seeking (Bossert et al., 2011). Additionally, we showed that inactivation of neurons previously activated by the recall of extinction memories increased lever pressing in the subsequent food seeking test. This finding suggests that neuronal ensembles within the vmPFC mediate extinction learning. This observation is in agreement with previous studies on the role of vmPFC in extinction of operant and Pavlovian conditioned behaviors (Peters et al., 2009; Millan et al., 2011). We interpret our findings to indicate that two distinct neuronal ensembles mediate self-administration recall and extinction recall. However, we cannot rule out that the same neuronal ensembles have simply changed valence, such that two prior extinction sessions alter the causal role of the same ensemble from promoting lever pressing (self-administration memory) to repressing lever pressing (extinction memory).
Using in situ hybridization, we found that both pyramidal glutamatergic projection neurons and GABAergic interneurons were activated during the test sessions. Most of these neurons were Vglut1 (glutamatergic) cells that comprised up to 90% of all the activated Fos-expressing neurons in the vmPFC. Less than 10% were GABAergic (Vgat). We found no significant differences in cell type between activated neurons after 0, 2, or 7 prior extinction sessions. This suggests that similar proportions of projection neurons and interneurons are activated by the recall of food reward memories and extinction memories. Because these neuronal ensembles involve largely heterogeneous populations of cells, this finding reinforces the idea that specific memories are encoded by specific patterns of neuronal activation that are largely cell-type independent (Bossert et al., 2011; Fanous et al., 2012; Cruz et al., 2014). It is currently unknown whether these populations of cells receive projections from or project to discrete brain regions. We are currently developing technologies capable of tackling this important question (Cruz et al., 2013), although the recently developed COIN strategy described by Thompson and Swanson (2010) could also be used to address this question.
In conclusion, we demonstrated a causal role for vmPFC neuronal ensembles in encoding food reward memories and extinction memories. Our experiments support the hypothesis that neuronal ensembles encoding opposing behaviors can coexist within the same brain area. Finally, our findings and those of Pfarr et al. (2015) suggest that conclusions regarding the role of specific brain areas and receptors within these areas in learned behaviors based on global inactivation or receptor blockade, independent of the neurons' activity state, may need to be reexamined by specific manipulations of neuronal ensembles that are activated during the learning tasks.
Footnotes
This work was supported by the Intramural Research Program/National Institute on Drug Abuse/National Institutes of Health.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Bruce T. Hope, Behavioral Neuroscience Branch, Intramural Research Program/National Institute on Drug Abuse/National Institutes of Health, 251 Bayview Boulevard, Suite 200, Baltimore, MD 21224. bhope@intra.nida.nih.gov