Abstract
The medial prefrontal cortex (mPFC) and nucleus accumbens (NAc) have been associated with the expression of adaptive and maladaptive behavior elicited by fear-related and drug-associated cues. However, reported effects of mPFC manipulations on cue-elicited natural reward-seeking and inhibition thereof have been varied, with few studies examining cortico-striatal contributions in tasks that require adaptive responding to cues signaling reward and punishment within the same session. The current study aimed to better elucidate the role of mPFC and NAc subdivisions, and their functional connectivity in cue-elicited adaptive responding using a novel discriminative cue responding task. Male Long–Evans rats learned to lever-press on a VR5 schedule for a discriminative cue signaling reward, and to avoid pressing the same lever in the presence of another cue signaling punishment. Postacquisition, prelimbic (PL) and infralimbic (IL) areas of the mPFC, NAc core, shell, PL-core, or IL-shell circuits were pharmacologically or chemogenetically inhibited while animals performed under (1) nonreinforced (extinction) conditions, where the appetitive and aversive cues were presented in alternating trials alone or as a compound stimulus; and (2) reinforced conditions, whereby cued responding was accompanied by associated outcomes. PL and IL inactivation attenuated nonreinforced and reinforced goal-directed cue responding, whereas NAc core and shell inactivation impaired nonreinforced responding for the appetitive, but not aversive cue. Furthermore, PL-core and IL-shell inhibition disinhibited nonreinforced but not reinforced cue responding. Our findings implicate the mPFC as a site of confluence of motivationally significant cues and outcomes, and in the regulation of nonreinforced cue responding via downstream NAc targets.
SIGNIFICANCE STATEMENT The ability to discriminate and respond appropriately to environmental cues that signal availability of reward or punishment is essential for survival. The medial prefrontal cortex (mPFC) and nucleus accumbens (NAc) have been implicated in adaptive and maladaptive behavior elicited by fear-related and drug-associated cues. However, less is known about the role they play in orchestrating adaptive responses to natural reward and punishment cues within the same behavioral task. Here, using a novel discriminative cue responding task combined with pharmacological or chemogenetic inhibition of mPFC, NAc and mPFC-NAc circuits, we report that mPFC is critically involved in responding to changing cued response-outcomes, both when the responses are reinforced, and nonreinforced. Furthermore, the mPFC coordinates nonreinforced discriminative cue responding by suppressing inappropriate responding via downstream NAc targets.
Introduction
The ability to discern, and adaptively respond to cues that signal reward and/or punishment outcomes is essential for survival. Successfully responding to changing cue-outcome contingencies requires the coordination of a complex set of processes, which includes the assignment of motivational value to environmental stimuli based on innate knowledge and learned experiences (Tooby and Cosmides, 1990; Ito and Lee, 2016). Maladaptive behavior can arise as a result of dysregulation in the processing of motivationally significant cues, as evident in fear/anxiety disorders and addiction (Aupperle and Paulus, 2010; Nguyen et al., 2015; Fricke and Vogel, 2020; McNamara and Ito, 2021), making it critical to elucidate the underlying neural circuits that govern cue-elicited adaptive behavior.
The medial prefrontal cortex (mPFC) and nucleus accumbens (NAc) are central components of the cortico-limbic-striatal system, ideally placed to regulate adaptive behavior motivated by both appetitive and aversive cues and outcomes (Balleine and Dickinson, 1998; Lang et al., 1998; Cardinal et al., 2002; Humphries and Prescott, 2010; Peters et al., 2009). They are interconnected in a topographical manner with the prelimbic cortex (PL) projecting predominantly to the NAc core, and the infralimbic cortex (IL) projecting selectively to the NAc shell (Sesack et al., 1989; Berendse et al., 1992; Vertes, 2004), likely forming functional pathways that enable the selection and elicitation of appropriate behavior, under the influence of limbic and dopaminergic inputs (Swanson, 1982; Hoover and Vertes, 2007; Pennartz et al., 2011). Indeed, a significant body of evidence suggests that the PL and IL subregions of the mPFC subserve dichotomous roles in the expression and extinction of conditioned fear responses and cue-induced drug-seeking, with the PL mediating the expression of conditioned freezing or cued drug-seeking, and the IL mediating the inhibition of freezing and cued drug-seeking, during and after, extinction learning (McFarland and Kalivas, 2001; Capriles et al., 2003; McLaughlin and See, 2003; Fuchs et al., 2005; Vidal-Gonzalez et al., 2006; Peters et al., 2008; Sotres-Bayon and Quirk, 2010; Van den Oever et al., 2010; Sierra-Mercado et al., 2011; LaLumiere et al., 2012; Gourley and Taylor, 2016). Further evidence suggests that this on/off switch-like function of the PL and IL in cue-elicited drug-seeking is effected via downstream NAc targets. 
Cocaine-elicited or cue-elicited reinstatement of cocaine-seeking is blocked by disrupting PL afferents to the NAc core (Cornish and Kalivas, 2000; Stefanik et al., 2013, 2016; McGlinchey et al., 2016; James et al., 2017), and conversely, by chemogenetically activating the IL-to-NAc shell circuit after extinction training (Augur et al., 2016).
However, there is mixed support for divergent PL and IL functions in cue-induced responding for other drugs and natural reinforcers, and in the inhibition of such responses in situations of nonreward or extinction (McLaughlin and See, 2003; Schmidt et al., 2005; Koya et al., 2009; Rogers et al., 2008; Mendoza et al., 2015; Pfarr et al., 2018; Caballero et al., 2019; Riaz et al., 2019; see Howland et al., 2022). Furthermore, very few studies have examined the role of cortico-striatal areas under situations in which animals are required to adapt their responses to changing cued response outcome contingencies (e.g., reward vs punishment) within the same test session.
The present study was therefore motivated by the need to clarify the extent to which interconnected subregions of the mPFC and NAc become engaged in generating adaptive instrumental responding as informed by multiple discriminative cues of mixed valence. To this end, we designed a novel discriminative cue responding task in which rats were trained daily to lever press for reward in the presence of a discriminative cue, and to inhibit pressing the same lever in the presence of a punishment cue within the same session. Postacquisition, different groups of rats received direct infusions of GABAA/B receptor agonists to inactivate the PL, IL, NAc core, or NAc shell, or of clozapine-N-oxide (CNO) to chemogenetically inhibit the PL-core or IL-shell pathway, before being tested for cue-induced responding with and without reinforcement.
Materials and Methods
Subjects
A total of 98 adult male Long–Evans rats were used (Charles River), weighing between 350 and 400 g at the time of surgery. They were housed in pairs under a 12/12 h light/dark cycle, with lights turning off at 7 P.M. Experiments occurred during the light phase of the cycle. Water was available ad libitum, but 2 d before the start of behavioral testing, food was restricted to sufficiently maintain body weight between 85% and 90% of animals' free feeding weights. All animal procedures were performed in accordance with the ethical and legal requirements under Ontario's Animals for Research Act, the Canadian Council of Animal Care, and approval of the University of Toronto Local Animal Care Committee.
Surgery
For all experiments, animals were anaesthetized using 3–4% isoflurane and placed in a stereotaxic frame for surgery. For the pharmacological inactivation experiments, a 26-gauge stainless-steel bilateral guide cannula (Plastics One) was implanted at one of the following coordinates (in mm from bregma): prelimbic cortex (PL; AP = +2.2, ML = ±0.75, DV = −2.5) or infralimbic cortex (IL; AP = +2.2, ML = ±0.75, DV = −3.4) of the mPFC, or caudal core (AP = +0.7, ML = ±1.5, DV = −5.7) or caudal shell (AP = +1.0, ML = ±1.0, DV = −6.5) of the NAc. For the chemogenetic experiments, the inhibitory DREADD construct rAAV8-CAMKIIa-hM4Di-mCherry (Addgene) or control AAV8-CAMKIIa-GFP (Addgene) was microinjected bilaterally into the PL (0.5 µl, AP = +2.2, ML = ±0.75, DV = −3.5) or IL (0.5 µl, AP = +2.2, ML = ±0.75, DV = −4.9) over 5 min, and the injector was left in place for a further 5 min to ensure diffusion of the AAV away from the tip. Following AAV injection, bilateral guide cannulae were implanted into the caudal core (AP = +0.7, ML = ±1.5, DV = −5.7) or shell (AP = +1.0, ML = ±1.0, DV = −6.5) to allow for direct delivery of CNO before test sessions. All guide cannulae were affixed to the skull using dental cement and jeweler's screws. Stainless-steel obturators were inserted into the guide cannulae to maintain patency. Rats received injections of 3 mg/kg ketoprofen during surgery as an analgesic. Animals underwent a minimum recovery period of 7 d in their home cages before beginning experimental training.
Drugs and microinfusions
Animals were habituated to gentle hand restraint for 3 d before microinfusion test days in the manner and environment in which drug infusions would be administered. On the day before the first drug session, all animals received an infusion of the saline vehicle to minimize the mechanical effects of subsequent infusions and to further habituate the animals to the procedure. On infusion test days, animals received 0.3-µl bilateral intracerebral microinjections of one of the following: a mixture of the GABAA receptor agonist muscimol and the GABAB receptor agonist baclofen (75 ng of each drug per infusion, Sigma-Aldrich) dissolved in physiological saline; the hM4Di receptor agonist CNO dihydrochloride (1 mM, R&D Systems) dissolved in physiological saline (0.9%); or the saline vehicle only. The drugs were infused via 33-gauge microinjectors using an infusion pump (Harvard Apparatus) mounted with 5-µl Hamilton syringes. The injector tips extended 1 mm below the guide cannulae in the PL, NAc core, and NAc shell, and 1.5 mm below in the IL. The infusion occurred at a rate of 0.3 µl/46 s, and the injector was left in place for an additional 1 min to ensure complete diffusion of the drug from the injector tip. Behavioral tests (described below) were administered ∼10–15 min after the end of each infusion.
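The infusion parameters above can be converted into more familiar units with simple arithmetic. The following is a minimal sketch, assuming that "75 ng of each drug per infusion" refers to the 0.3 µl delivered at each site; the function and constant names are illustrative and do not come from the study.

```python
# Infusion parameters as stated in the text.
VOLUME_UL = 0.3      # volume delivered per infusion (µl)
DURATION_S = 46.0    # time taken to deliver that volume (s)
DOSE_NG = 75.0       # mass of each agonist per infusion (ng); assumed per site


def flow_rate_ul_per_min(volume_ul: float, duration_s: float) -> float:
    """Convert a volume delivered over a duration into µl/min."""
    return volume_ul / duration_s * 60.0


def concentration_ng_per_ul(dose_ng: float, volume_ul: float) -> float:
    """Concentration required for one infusion to deliver dose_ng."""
    return dose_ng / volume_ul


rate = flow_rate_ul_per_min(VOLUME_UL, DURATION_S)
conc = concentration_ng_per_ul(DOSE_NG, VOLUME_UL)
print(f"flow rate: {rate:.3f} µl/min")
print(f"concentration of each agonist: {conc:.0f} ng/µl")
```

Under these assumptions the pump rate works out to roughly 0.39 µl/min, and each agonist would be prepared at about 250 ng/µl.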
Mixed valence discriminative cue responding task (Fig. 1)
Figure 1. Mixed valence discriminative cue responding task. a, Experimental timeline of the novel discriminative cue responding task. b, Rats were first trained to learn that lever pressing in the presence of a light stimulus leads to sucrose pellet reward on a fixed-ratio (FR1), then variable-ratio (VR5) schedule of reinforcement (App), that lever pressing in the presence of a tone leads to the delivery of mild shock (Av), and that lever pressing in the presence of white noise has no programmed consequence (Neu). c, Upon successful demonstration of cue acquisition in a test in which the App, Av, and Neu cues were presented in extinction, (d) rats were microinfused with Muscimol/Baclofen (M/B) or saline into the PL, IL, NAc core, or NAc shell, or with CNO (0.1 mM) or saline into the NAc core or shell in the PL-core and IL-shell pathway groups, and were administered a test session under extinction conditions (Cue Test, nonreinforced), in which they were presented with 10 trials each of the App cue, Av cue, and App+Av cue, in randomized order. Rats were then given three retraining sessions, following which (e) the same cue test was repeated, but this time with the alternate drug/saline. Rats underwent three further retraining sessions, and then underwent two further cue tests with outcomes (Cue Test, reinforced), with drug or saline administered before each.
A novel operant lever-pressing task was designed to measure the animals' motivation to respond to differential cues which signaled either appetitive, aversive, neutral, or competing response outcomes within the same behavioral paradigm.
Apparatus
Behavioral testing was conducted in 10 operant chambers (30.5 cm long × 24.1 cm wide × 29.2 cm high, Med Associates), each illuminated by a house light (28 V) and contained within a sound-attenuating box. Each chamber contained a shockable grid floor made of stainless-steel rods (0.5-cm-diameter rods, spaced 1.6 cm apart), and a side wall with two retractable levers, positioned equidistantly from a receptacle to which sucrose pellets were delivered (45 mg, TestDiet). Placed 3 cm above each lever was a round disk (2-cm diameter) that was illuminated by a light bulb, which served as a stimulus light. The chamber was also equipped with a tone generator (2 kHz Sonalert, Med Associates) and a white noise generator (Med Associates), mounted high and low on the wall opposite the levers, respectively. All operations in the chambers were controlled via a computer with MED-PC IV software (Med Associates), which also automatically recorded the data generated during the experiment.
Pretraining
Training on the task began with two daily 15-min sessions of magazine training in the operant chamber, where the animals learned to collect sucrose pellets from the receptacle on a variable interval (VI) 20-s schedule. The rats then underwent 2 d of lever press training in 15-min sessions where pellets were dispensed on a fixed-ratio (FR)1 schedule, up to a maximum of 50 pellets earned daily.
Discriminative cue training (Fig. 1b)
The next phase was discriminative stimulus training, wherein the rats learned that three different stimuli (tone, white noise, or flashing light) signaled three different outcomes of lever pressing on an FR1 schedule of reinforcement. Thus, one lever press during the presentation of the “appetitive” cue led to the delivery of one sucrose pellet, lever pressing during the “aversive” cue led to the delivery of a mild foot-shock (0.5 s, 0.3 mA), and pressing during the “neutral” cue had no programmed consequence. Each daily session consisted of a total of 30 trials in which the lever was extended into the operant chamber and the cue was presented for 90 s, during which time the animal was free to press the lever. There were 10 interspersed trials for each of the discriminative cue types signaling appetitive, aversive, or neutral outcomes. After 6 daily sessions of discriminative cue responding under an FR1 schedule, the animals proceeded to the next phase of training, in which outcomes were delivered on a variable-ratio (VR)5 schedule during the appetitive, aversive, and neutral cue presentations over 8 daily sessions.
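The cue-outcome contingencies and ratio schedules described above can be illustrated with a short simulation. This is a sketch under stated assumptions rather than the MED-PC program used in the study: the text does not specify how VR5 requirements were generated, so the sketch draws each requirement uniformly from 1 to 9 presses (mean of 5), and the names `RatioSchedule`, `CUE_OUTCOME`, and `run_trial` are hypothetical.

```python
import random

# Outcome signaled by each cue, as described in the text.
CUE_OUTCOME = {
    "appetitive": "sucrose pellet",
    "aversive": "foot-shock",
    "neutral": None,  # no programmed consequence
}


class RatioSchedule:
    """Signals that an outcome is due each time a press requirement is met."""

    def __init__(self, mean_ratio: int, variable: bool, rng: random.Random):
        self.mean_ratio = mean_ratio
        self.variable = variable
        self.rng = rng
        self.presses_left = self._next_requirement()

    def _next_requirement(self) -> int:
        if not self.variable:
            return self.mean_ratio  # fixed ratio, e.g. FR1
        # Assumption: uniform on 1..(2 * mean - 1), which averages mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def press(self) -> bool:
        """Register one lever press; return True if an outcome is due."""
        self.presses_left -= 1
        if self.presses_left == 0:
            self.presses_left = self._next_requirement()
            return True
        return False


def run_trial(cue: str, n_presses: int, schedule: RatioSchedule) -> int:
    """Count outcomes delivered for n_presses made during one cue presentation."""
    if CUE_OUTCOME[cue] is None:
        return 0
    return sum(schedule.press() for _ in range(n_presses))


rng = random.Random(0)
fr1 = RatioSchedule(mean_ratio=1, variable=False, rng=rng)
vr5 = RatioSchedule(mean_ratio=5, variable=True, rng=rng)
print(run_trial("appetitive", 7, fr1))      # every press is reinforced on FR1
print(run_trial("aversive", 10_000, vr5))   # about one shock per five presses
```

In this sketch the schedule object carries its remaining press requirement across trials; whether the requirement was reset at the start of each cue presentation in the actual task is not stated.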
Cue acquisition test (Fig. 1c)
Following the acquisition of discriminative cue responding, a probe test was conducted under extinction conditions to verify whether the animals had learned the associations between the cues and lever-pressing outcomes. The probe test followed the same format as the training days, with ten 90-s trials of each discriminative stimulus presentation (appetitive, aversive, neutral), except that no outcomes were delivered. The animals were expected to press most in response to the appetitive cue, at an intermediate level to the neutral cue, and least to the aversive cue. After the probe test, the animals were retrained for 5 d on the VR5 schedule of reinforcement with outcomes present.
Discriminative cue test under nonreinforced conditions (Fig. 1d)
The following day, the animals underwent a discriminative cue test under extinction, which measured their responding to the presentation of the appetitive cue, aversive cue, as well as to a simultaneous presentation of the appetitive and aversive cues (compound cue). Ten 90-s trials occurred for each stimulus type (appetitive, aversive, compound) and lever press responses during stimulus presentations were recorded. Before this test, animals received intracerebral infusions of either the GABAA and GABAB receptor agonists muscimol/baclofen or saline into either the PL, IL, NAc shell or core for the inactivation experiments, and CNO or saline into the NAc core or shell for the chemogenetic experiment. The animals were then retrained over 3 d on the VR5 schedule of reinforcement. The cue test under extinction was then repeated in a within-subjects manner, with the animals undergoing a reversal in their respective drug/saline administration conditions. The animals were again retrained for 3 d on the VR5 schedule.
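The structure of a cue test session and the within-subjects drug reversal can be sketched as follows. This is an illustration only: the task presented the three trial types in randomized order, but any constraints on that randomization are not described, so a plain shuffle is assumed here. The function names and rat identifiers are hypothetical.

```python
import random

CUES = ("appetitive", "aversive", "compound")
TRIALS_PER_CUE = 10


def cue_test_order(rng: random.Random) -> list[str]:
    """Return a shuffled 30-trial sequence with 10 trials per cue type."""
    trials = [cue for cue in CUES for _ in range(TRIALS_PER_CUE)]
    rng.shuffle(trials)  # assumption: unconstrained randomization
    return trials


def crossover_assignment(
    rat_ids: list[str], rng: random.Random
) -> dict[str, tuple[str, str]]:
    """Assign each rat an infusion order (test 1, test 2), balanced across rats."""
    ids = list(rat_ids)
    rng.shuffle(ids)
    half = len(ids) // 2
    return {
        rat: ("drug", "saline") if i < half else ("saline", "drug")
        for i, rat in enumerate(ids)
    }


rng = random.Random(1)
order = cue_test_order(rng)
plan = crossover_assignment(["r1", "r2", "r3", "r4", "r5", "r6"], rng)
print(order[:5])
print(plan)
```

Each rat therefore contributes data under both infusion conditions, with retraining sessions separating the two tests.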
Discriminative cue test with reinforcement (Fig. 1e)
The following day, the animals underwent another discriminative cue test [appetitive/aversive/compound (appetitive+aversive) cues] in the same format as the previous extinction test, except that lever pressing now produced the associated outcomes. Before this test, the animals were again infused with muscimol/baclofen, CNO, or saline. After 1 d of retraining on the VR5 schedule, animals underwent a final mixed valence test with outcomes present, with the drug and saline conditions reversed in a within-subjects design.
Single appetitive cue task
A separate cohort of rats (PL: n = 5, IL: n = 7) was trained to undergo the same task, but with the goal to learn the meaning of the appetitive cue only. Thus, the appetitive cue operant task followed the same experimental course as the discriminative cue responding task. The number of training days, schedule of reinforcement, and outcome contingencies on testing days were equivalent between the tasks. Training and testing sessions included 10 trials of appetitive cue responding, as in the mixed valence discriminative cue task, but no additional trials for aversive, neutral, or compound cues were presented.
Outcome devaluation tests
Finally, we conducted a devaluation test on the animals from the NAc inactivation experiment (n = 18) to determine if the parameters of the mixed valence cue discrimination and operant appetitive cue task resulted in instrumental responding that was goal-directed or habitual in nature. The distinction was operationally defined by whether or not the lever-pressing behavior was sensitive to devaluation of the sucrose reward outcome by satiety (Adams and Dickinson, 1981).
At the end of the discriminative cue task, animals were retrained for 1 d on the VR5 schedule of reinforcement. The following day, half of the animals were placed in the devaluation condition whereby they were permitted to consume an unlimited amount of sucrose in their cages during a 1-h period. The other half did not receive any pellets. Immediately afterward, behavioral testing was performed under extinction conditions, in the same manner as during the discriminative cue responding or appetitive cue operant task. The next day the animals underwent a final day of VR5 training. The following day the animals experienced the alternate devaluation condition and were again tested for lever pressing under extinction conditions.
Histology
After completion of behavioral testing in the pharmacological inactivation experiments, animals were injected with a lethal dose of pentobarbital and perfused intracardially with 100 ml of saline, followed by 100 ml of 4% paraformaldehyde (PFA) in PBS. Brains were removed and stored in PFA. Coronal slices of 50-µm thickness were cut with a vibratome, and then stained with cresyl violet for viewing under a microscope to verify the placement of cannulae.
For verification of drug spread, two experimentally naive animals with bilateral PL and IL cannulation (respectively) were infused with 0.3 µl of fluorophore-conjugated muscimol 15 min before perfusion. These brains were stored in PBS, sliced coronally (50-µm slices) and coverslipped with Fluoroshield Mounting medium with DAPI for the visualization of the fluorophore under a fluorescent microscope (visualized with a TRITC filter).
For the chemogenetic experiments, rats in the hM4Di groups were perfused transcardially 75 min after bilateral microinjections of 0.3-µl CNO or saline, and the extracted brains were placed in 4% PFA overnight. Brains were then cut in 50-µm slices using a vibratome (Leica VT1000, Wetzlar, Germany) and placed in 0.1 M PBS containing 0.1% sodium azide. For immunofluorescence processing, brain slices were gently agitated in three successive washes with PBS, followed by a 30-min wash in 1% H2O2 in PBS at room temperature, then another three PBS washes before being incubated overnight at 4°C with a polyclonal rabbit primary antibody (c-Fos, SYSY) at a 1:5000 dilution in Tris-NaCl-blocking buffer (TNB). The following day, slices were washed with PBS and incubated with a donkey anti-rabbit secondary antibody horseradish peroxidase (HRP) conjugate at a 1:500 dilution in TNB for 1 h. The brain slices were then washed with PBS again before undergoing a 30-min tyramide signal amplification (TSA) wash with fluorescein TSA to allow c-Fos to be visualized in a different color than the fluorophore associated with the AAV. The TSA reagent was diluted 1:500 in borate buffer and 0.01% H2O2. The brain slices underwent a final PBS wash (3 × 2 min) before being mounted and coverslipped with Fluoroshield mounting medium (containing DAPI as a nuclear stain). c-Fos immunoreactivity was visualized at 10× and 20× magnification using a Nikon microscope equipped for fluorescence microscopy. c-Fos-labeled cells were counted blind to the experimental conditions, at terminal sites in the NAc core or shell, and averaged across three samples for each rat.
Experimental design and statistical analysis
We used a mixed factorial design for all experiments, with region (PL vs IL, NAc core vs shell) or pathway (PL-core vs IL-shell) as the between-subject factor, and drug (CNO or M/B vs saline) and/or cue (appetitive, aversive, compound) and/or trial (for the nonreinforced cue tests) as within-subject factors. All data were analyzed in R (R Core Team, 2021) using the lme4 package (Bates et al., 2015). For each reinforced and each nonreinforced (i.e., conducted under extinction) discriminative cue test, the total number of responses emitted across the 10 trials of each cue type (appetitive, aversive, compound) was calculated. Additionally, responses emitted in each of the 10 trials for each discriminative cue were log-transformed, as they were not normally distributed. Linear mixed modeling (LMM) was then used to analyze the outcome and response data generated in the reinforced and extinction test sessions, with drug (saline, CNO, or M/B), cue (appetitive, aversive, compound), and trial (1–10) as fixed effects, and by-subject intercepts as random effects. Post hoc comparisons were conducted with the emmeans package using Tukey's familywise correction, with the level of significance set at p < 0.05. Degrees of freedom were calculated using the Satterthwaite method.
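The data preparation steps described above (per-cue response totals and the log transformation of trial-by-trial responses) can be sketched as follows. This is not the authors' R code: the records are made-up illustrative values, and the use of ln(x + 1) is an assumption, since the text does not state how trials with zero responses were handled.

```python
import math
from collections import defaultdict

# Hypothetical records: (subject, drug, cue, trial, lever presses).
records = [
    ("rat01", "saline", "appetitive", 1, 12),
    ("rat01", "saline", "appetitive", 2, 9),
    ("rat01", "saline", "aversive", 1, 0),
    ("rat01", "M/B", "appetitive", 1, 4),
    ("rat01", "M/B", "aversive", 1, 2),
]


def total_responses(rows):
    """Sum presses across trials for each (subject, drug, cue) cell."""
    totals = defaultdict(int)
    for subject, drug, cue, _trial, presses in rows:
        totals[(subject, drug, cue)] += presses
    return dict(totals)


def log_transform(rows):
    """Natural-log transform of per-trial presses.

    Assumption: ln(x + 1) keeps zero-press trials defined; the offset
    actually used in the study is not reported.
    """
    return [(s, d, c, t, math.log(p + 1)) for s, d, c, t, p in rows]


totals = total_responses(records)
transformed = log_transform(records)
print(totals[("rat01", "saline", "appetitive")])
print(round(transformed[0][4], 3))
```

In lme4 notation, a model with the fixed effects and random intercepts described above might be written as `ln_responses ~ drug * cue * trial + (1 | subject)`. The exact formula is not given in the text, so this form should be read as an assumption.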
Data availability
The raw data can be requested by contacting the lead contact.
Results
Verification of cannula placements and chemogenetic inhibition
For selective targeting of the prelimbic (PL) or infralimbic (IL) cortices of the mPFC, injector tips were confirmed to be in locations +3.2 to +2.2 anterior to bregma (Paxinos and Watson, 1998), within the confines of PL and IL (Fig. 2a). Drug spread within the PL and IL was estimated to be 0.1 mm (radius) per 0.1 µl of drug volume injected (0.3-mm radius total) with the use of 0.3 µl of fluorophore-conjugated muscimol, which is consistent with previous findings of drug spread measured using the same agent in the hippocampus (Schumacher et al., 2018; Riaz et al., 2019) as well as a functional assay of drug action as measured by cfos activation surrounding the injector tip in the NAc (Hamel et al., 2017). After three rats were removed from the study because of illness, final group numbers were as follows: PL, n = 15; IL, n = 12. Another cohort of rats (PL, n = 5; IL, n = 7) completed the appetitive cue only operant task. For the targeting of NAc shell and core areas, all injector tips were located in sections +0.7 to +1.0 anterior to bregma (Paxinos and Watson, 1998), within the boundaries of the shell and core (Fig. 3a). Two rats developed an infection, and were subsequently removed from the study. Final group numbers were: core (n = 10), shell (n = 8). For the chemogenetic mPFC-NAc circuit inhibition experiment, AAV (hM4Di mCherry and GFP) expression was verified at the site of the injection (PL or IL) and terminals in the NAc core or shell (Fig. 4a–d), and injector tips were verified to be correctly placed within the core and shell boundaries as defined above (Fig. 4c,d). Crucially, CNO-induced inhibition of the PL/IL terminals in the NAc core and shell was confirmed by a significant decrease in cfos activity in animals infused with CNO in the core and shell, compared with those infused with saline (core; t(9)=2.71, p < 0.025, shell; t(8) = 4.32, p < 0.0025; Fig. 4e–h). 
One animal in the IL-shell GFP group died soon after surgery and therefore could not proceed to behavioral testing. The final group numbers were: PL-Core hM4Di (n = 10), IL-Shell hM4Di (n = 10), PL-Core GFP (n = 7), IL-Shell GFP (n = 8).
Figure 2. PL and IL inactivation impairs discriminative responding to appetitive (App), aversive (Av), and compound (App + Av) cues. a, Schematic figures show the placement of the infusion cannula tips in the prelimbic cortex (PL) on the left and infralimbic cortex (IL) on the right, and the photographs show representative images of cannula placement (top) and the spread of fluorophore-conjugated Muscimol (bottom). b, All animals demonstrated successful learning of the discriminative cues. c, d, Muscimol-Baclofen (M/B)-induced inactivation of both the PL and IL diminished reinforced responding (expressed as the total number of responses ± SEM) during the App cue and increased reinforced responding during the Av cue. e, i, PL and IL inactivation reduced the total number of nonreinforced responses during the App cue presentation only. f–h, j–l, However, analysis of the trial-by-trial response data (Ln-transformed responses ± SEM) from the extinction tests revealed that nonreinforced responding was significantly reduced during App cue and App+Av cue presentations, but increased during Av cue presentations. m, n, A separate cohort of rats was trained and tested with a single appetitive cue only. Neither PL nor IL inactivation had any effect on reinforced or nonreinforced single appetitive cue responding. Significant drug effects are indicated as *p < 0.05, **p < 0.01, ***p < 0.001, in c–l, across the two brain sites, following significant drug × cue interactions (p < 0.0001).
Figure 3. NAc core and shell inactivation impairs discriminative responding to appetitive (App) and compound (App + Av) cues under extinction, but only NAc shell inactivation impairs reinforced discriminative cue responding. a, Figures show the placements of the infusion cannula tips for NAc core and shell. b, All animals acquired discriminative cue responding as expected. c, d, GABAR-mediated (Muscimol/Baclofen, M/B) inactivation of the NAc shell, but not NAc core, impaired reinforced discriminative cue responding. e, i, Inactivation of the NAc core and shell reduced the total number of responses emitted during the appetitive cue. f–h, j–l, However, trial-by-trial data analysis revealed that responding during the compound cue, as well as during the App cue, was significantly diminished following NAc core and shell inactivation. m, n, Prefeeding the animals with sucrose pellets for 1 h before a discriminative cue test administered under extinction conditions significantly reduced responding during all cues, indicating that the animals' discriminative cue responding was under the control of action-outcome associations. All data are expressed as mean number of responses (or Ln-transformed for f–h, j–l) ± SEM. Significant drug effects are indicated in d as *p < 0.05 and ***p < 0.001, while significant cue effects are depicted as #p < 0.05 and ###p < 0.001. In panels e–l, significant drug effects are indicated as *p < 0.05, **p < 0.01, ***p < 0.001, across the two brain sites, following a significant drug × cue interaction (p < 0.0001).
Figure 4. PL-core and IL-shell circuit inhibition does not impact reinforced discriminative cue responding. a, b, Locations and expression of hM4Di (AAV-CAMKII-hM4Di-mCherry) virus in the PL-core and IL-shell projections. Representative 4× images at the top show mCherry/hM4Di expression in the PL or IL (sites of AAV infusion), and 40× images at the bottom show mCherry/hM4Di expression in the projection areas, NAc core and shell. c, d, Infusion cannula tip placements in the NAc core and shell are shown as red circles, and shaded areas (pink) show the largest extent of terminal viral expression observed. e, g, 40× images show c-Fos-positive cells in the NAc core (e) and shell (g) of saline-administered versus Clozapine-N-Oxide (CNO)-administered rats. f, h, The numbers of c-Fos+ cells around the site of saline or CNO infusions were counted from 40× images in the NAc core and shell. There was a significant CNO-induced reduction in c-Fos+ cell counts in both regions (*p < 0.05, **p < 0.01). i, l, All animals successfully acquired cue discrimination. j, k, m, n, Reinforced discriminative cue responding was not affected by PL-core or IL-shell inhibition or by CNO microinfusions into the NAc of GFP control animals. All data are expressed as mean c-Fos+ cell density or number of responses ± SEM.
Cue acquisition test
The acquisition of discriminative cue responding was assessed in a cue test administered in extinction before the administration of drugs (Figs. 2b, 3b, 4i,l). All animals acquired discriminative responding successfully (mPFC: cue, F(2,50) = 130.866, p < 0.0001; NAc: cue, F(1.26,20.09) = 147.92, p < 0.0001; hM4Di pathway: cue, F(1.06,20.21) = 144.76, p < 0.0001; GFP pathway: cue, F(1.07,13.87) = 134.41, p < 0.0001), with animals emitting significantly more lever presses for the appetitive versus neutral cue (p < 0.0001), for the neutral versus aversive cue (p < 0.0001), and for the appetitive versus aversive cue (p < 0.0001). There were no preexisting group differences in discriminative cue learning (mPFC: region, F(1,25) = 0.368, p = 0.550, cue × region, F(2,50) = 0.141, p = 0.869; NAc: region, F(1,16) = 3.42, p = 0.083, cue × region, F(1.26,20.09) = 0.05, p = 0.96; hM4Di pathway: region, F(1,19) = 0.26, p = 0.62, cue × region, F(1.06,20.21) = 0.33, p = 0.97; GFP pathway: region, F(1,13) = 0.012, p = 0.91, cue × region, F(1.07,13.98) = 0.014, p = 0.99).
Following successful cue acquisition, rats were administered two rounds of cue tests, in which the appetitive, aversive, and compound (simultaneous presentation of appetitive and aversive cues) cues were presented under extinction (no outcome) and reinforced (outcomes given) conditions. In a within-subject design, each rat received one cue test with saline infusions into the target region, and another test with drug (M/B or CNO) infusions (with the order of testing counterbalanced across rats).
PL and IL inactivation attenuates adaptive responding to appetitive and aversive cues and outcomes
GABAA/B receptor agonist-mediated inactivation of the mPFC led to a significant reduction in reinforced responding for the appetitive cue (App), and a contrasting increase in responding for the aversive cue (Av), compared with when saline was injected (drug × cue: F(2,130) = 18.46, p < 0.0001; post hoc comparisons for App and Av: p < 0.0001, App+Av: p = 0.23; Fig. 2c,d), regardless of whether the target region was the PL or IL (region: F(1,26) = 1.59, p = 0.21; all interactions including region, p > 0.25).
Inactivation of both PL and IL led to a significant decrease in the total responses for the appetitive cue, but no change in aversive or compound cue responding under extinction conditions (drug × cue: F(2,1593) = 118.24, p < 0.0001, App; p < 0.0001, Av; p = 0.36, App+Av, p = 0.32; Fig. 2e,i). However, a separate analysis of the trial-by-trial cued responding revealed that M/B infusions into the PL and IL reduced mean numbers of responding during the appetitive and compound cue presentations, and increased responding during the aversive cue (Figs. 2f–h,j–l), and that responding for the appetitive and compound cues were altered to a greater degree than for the aversive cue after PL or IL inactivation (drug; F(1,1593) =163.99, p < 0.0001, drug × cue: F(2,1593) = 118.23, p < .0001, App and App+Av; p < 0.0001, Av; p < 0.03).
To examine whether PL or IL inactivation led to a deficit in appetitive cue responding per se, a different cohort of rats was trained on a version of the task in which they were trained on appetitive cue responding only. Rats were then tested for cued responding in extinction, and with outcomes present. There was no significant difference between drug and saline conditions when cued responding was reinforced in the PL or IL groups (drug: F(1,6) = 1.04 p = 0.35, drug × region: F(1,2) = 0.82, p = 0.46; Fig. 2m). Similarly, under nonreinforced conditions, it was found that there was no significant effect of PL or IL inactivation on appetitive cue responding (drug; F(1,6) = 1.72, p = 0.24, drug × region: F(1,2) = 2.47, p = 0.26; Fig. 2n).
Inactivation of NAc core and shell reduced responding for appetitive and compound cues in extinction, and NAc shell inactivation alone impaired reinforced discriminative cue responding
Pharmacological inactivation of the NAc core and shell led to differential effects on the total number of reinforced responses for the appetitive and mixed cue (drug × cue × region: F(2,85) = 4.57, p < 0.02; Fig. 3c,d), with M/B drug infusions into the shell, but not core, inducing significant decreases in responding for the appetitive and mixed cues compared with saline infusions (Shell: App, p < 0.0001, App+Av, p < 0.05, Av, p = 0.34, Core: all cues, p > 0.37). The ability of shell-inactivated animals to discriminate between cues was also significantly impaired (Shell saline: App vs Av, p < 0.0001, Av vs App+Av, p < 0.05; Shell drug: all p > 0.87).
Inactivation of the NAc core and shell under nonreinforced conditions led to an overall decrease in the total responses emitted for all cues (drug: F(1,80) = 5.84, p < 0.02; Fig. 3e,i), predominantly driven by a significant reduction in responding for the appetitive cue (drug × cue: F(2,1003) = 15.99, p < 0.0001, App: p < 0.0001, App+Av: p = 0.23, Av: p = 0.67), regardless of target region (region: F(1,16) = 3.47, p = 0.08, all interactions with region, p > 0.23). Separate analyses conducted on the trial-by-trial data further showed that NAc core and shell inactivation led to significant decreases in responding for appetitive and compound cues, but not aversive cue across trials (drug × cue: F(2,1121) = 22.57, p < 0.0001; App, p < 0.0001, App+Av, p < 0.001, Av, p = 0.92; Fig. 3f–h,j–l), regardless of target region (region: F(1,17) = 2.03, p = 0.17). However, the overall level of responding for the cues was significantly attenuated to a greater extent after NAc core inactivation, compared with shell inactivation (drug × region: F(1,1121) = 8.63, p < 0.01, core: p < 0.0001, shell: p < 0.03).
Discriminative cue responding was under the control of action-outcome associations
A reward devaluation test was also administered on the completion of all testing in the NAc cannulated animals to ascertain whether the animals' cued responding was under the control of action-outcome associations, as opposed to stimulus-response (habitual) associations. Before the test session, animals were given free access to the sucrose pellets in their home cages for 1 h (Fig. 3m). Animals consumed a mean of 20.00 g (SD = 6.06) of the pellets during the 1-h free consumption period. ANOVA of the devaluation test data indicated that discriminative cue responding was sensitive to the devaluation procedure (devaluation condition: F(1,17) = 49.229, p < 0.0001; Fig. 3n), with a reduction of lever pressing for the appetitive cue, as well as other cues (App, p < 0.0001, Av, p < 0.01, App+Av, p < 0.0001).
Chemogenetic inhibition of the PL-core and IL-shell pathways increased responding to appetitive, aversive and compound cues only under extinction conditions
Inhibiting the PL-core or IL-shell pathways did not have a significant impact on reinforced cued responding, with all animals displaying discriminative responding for cues (cue: F(2,95) = 194.57, p < 0.0001; all analyses including drug: p > 0.40, all analyses including region: p > 0.39; Fig. 4j,k). However, inhibition of PL-core and IL-shell pathways led to significant increases in the total number of responses emitted under extinction (drug: F(1,100) = 14.67, p < 0.0001; Fig. 5a,e), although the ability to discriminate the cues remained intact (cue: F(2,100) = 104.66, p < 0.0001). The CNO-induced effect on the total number of discriminative responses was primarily driven by increased responding to the App cue (drug × cue: F(2,100) = 3.11, p < 0.05, App: p < 0.0001, Av: p = 0.12, App+Av: p = 0.39). Analyses of the trial-by-trial responses revealed that inhibition of the PL-core and IL-shell pathways increased responding for all cues (drug: F(1,1180) = 32.10, p < 0.0001; Fig. 5b–d,f–h), although the potentiation was more pronounced following inhibition of the PL-core pathway (drug × region: F(1,1180) = 5.64, p < 0.02, drug vs saline: PL-core: p < 0.0001, IL-shell: p < 0.03). Responses emitted for all cues declined over the session (trial: F(9,1180) = 57.70, p < 0.0001), regardless of the target region or drug infusion (drug × region × trial: F(9,1180) = 1.34, p = 0.21).
Figure 5. PL-core and IL-shell circuit inhibition disinhibits nonreinforced discriminative cue responding. a, e, Chemogenetic inhibition of PL-core and IL-shell circuits induced increases in the total number of responses emitted during the presentation of all cues under extinction conditions. b–d, f–h, Trial-by-trial extinction data confirmed that responding during the presentation of App, Av, and Compound cues was elevated across the 10 trials following PL-core and IL-shell inhibition, although the effect of PL-core inhibition on responding was significantly stronger than that of IL-shell inhibition. i–p, CNO infusion into the NAc of PL-core and IL-shell GFP controls did not have any significant effects on reinforced and nonreinforced discriminative cue responding. All data are expressed as mean number of responses (or Ln-transformed for b–d, f–h, j–l, n–p) ± SEM. Significant drug effects are indicated in a, e as ***p < 0.001, and in b–d, f–h as *p < 0.05, ***p < 0.001, across all cues following a significant drug × region interaction (p < 0.01).
CNO infusions in GFP control animals did not affect reinforced discriminative cue responding, as compared with saline infusions (cue: F(2,74.16) = 363.12, p < 0.0001, all interactions including drug, p > 0.08, all interactions including region, p > 0.45; Fig. 4m,n). The GFP control groups also showed discriminative cue responding under extinction that was unaffected by CNO infusions (drug: F(1,75) = 3.42, p = 0.07, cue: F(2,75) = 88.92, p < 0.0001; Fig. 5i,m). Analysis of the trial-by-trial data did reveal that infusions with CNO led to significantly decreased overall levels of cue responding (drug: F(1,885) = 4.12, p < 0.05; Fig. 5j–l,n–p). However, this change in responding occurred in the opposite direction to that observed after CNO infusions in the hM4Di groups.
Discussion
The present study investigated the contributions of interconnected subregions of the mPFC and NAc to the ability to adapt instrumental responding to trial-by-trial changes in cued response-outcomes, under both reinforced (feedback provided) and nonreinforced (extinction) conditions. Using a novel mixed valence discriminative cue task in which rats were trained to lever press for sucrose reward in the presence of a discriminative light cue, and to withhold responding on the same lever in the presence of a discriminative tone cue signaling shock, we found that transient GABAR-mediated inactivation of the PL and IL regions led to a loss of reinforced and nonreinforced discriminative cue responding. In contrast, inactivation of the NAc core or shell selectively attenuated responding for the appetitive and compound (appetitive + aversive) cues under extinction conditions, and shell inactivation alone impaired reinforced discriminative cue responding. Finally, chemogenetic inhibition of the PL-core and IL-shell pathways induced disinhibition of discriminative responding to all cues under extinction conditions only. Together, our results implicate cortico-striatal circuits in mediating the coordination of adaptive responses to motivationally significant cues, especially under situations in which cue-elicited responding is not reinforced.
PL and IL are critical in adaptive responding to changing cue and outcome contingencies
In the present study, GABAA and GABAB receptor-mediated inactivation of both PL and IL impaired reinforced and nonreinforced cue responding, specifically under task conditions in which discrimination between cues signaling reward versus punishment was required, as the same PL and IL manipulations failed to have any effect when animals responded to a single appetitive cue only. Although PL and IL have been widely implicated in subserving go/stop function in conditioned drug-seeking and fear (see Gourley and Taylor, 2016), our findings corroborate a growing number of studies that suggest that PL and IL recruitment is equally important in cued responding under conditions in which multiple response outcomes need to be disambiguated and appropriate responding deployed (see Howland et al., 2022). Discriminative responding to cues signaling reward versus nonreward is consistently impaired following PL and IL inactivation under both reinforced and nonreinforced conditions, typically with reports of increased responding to a cue that signals no reward, and diminished responding to the cue signaling the availability of sucrose or cocaine reward (Ishikawa et al., 2008a, b; Ghazizadeh et al., 2012; Moorman and Aston-Jones, 2015; Gutman et al., 2017). The necessity of both PL and IL has also been demonstrated in a study examining Pavlovian responses to discriminative cues signaling reward, fear, and safety, where PL and IL inactivation led to diminished anticipatory nosepoke responses to the reward cue, reduced freezing to the fear cue, and attenuated cue discrimination (Sangha et al., 2014).
Together with the present observations that PL and IL inactivation affected the animals' ability to respond appropriately to both appetitive and aversive cues by diminishing responding to the former, and increasing responding to the latter, there is compelling evidence that IL and PL subserve an important role in monitoring and adapting goal-directed behavior to changing contingencies in cued response outcomes by suppressing inappropriate actions, while promoting relevant ones.
Our results are also highly consistent with the notion that the mPFC acts as a site of confluence of motivationally significant cues and goals (Euston et al., 2012). Neural activity in the mPFC is modulated by cues that signal both appetitive and aversive outcomes, and by the outcomes themselves (Chang et al., 1997; Baeg et al., 2001; Pratt and Mizumori, 2001; Gilmartin and McEchron, 2005). It was noted in the present study, however, that under extinction conditions, responding to the appetitive cue was altered to a greater extent than to the aversive cue. Similarly, a recent electrophysiological study wherein rats were trained to discriminate responding for appetitive and aversive cues demonstrated that PL and IL activity is more attuned to cues predictive of appetitive outcomes, relative to cues that predict aversive or no outcomes (Gentry and Roesch, 2018). A possible explanation for the observed bias for appetitive cues is that the cell recordings or manipulations in the present study preferentially targeted local ensembles in the PL/IL concerned with appetitive cue processing. The existence of separate, but sometimes overlapping, mPFC neuronal ensembles activated by cue-induced seeking of drug or natural reinforcers and inhibition of drug/reward-seeking has been reported (Bossert et al., 2011; Pfarr et al., 2015, 2018; Warren et al., 2019). However, whether such ensembles are organized in regionally defined clusters warrants further investigation.
Given that the PL is thought to be recruited in situations in which competing actions need to be resolved (Haddon and Killcross, 2005; De Wit et al., 2006; Marquis et al., 2007), we included the presentation of a compound cue (App+Av) in our test sessions, to assess the rats' ability to resolve competing motivational outcomes in the absence of PL/IL. PL- and IL-inactivated rats showed altered (reduced) responses to a compound cue presentation (App+Av) under extinction conditions only. However, we believe this effect to result from appetitive cue-invoked responding being disproportionately impacted over aversive cued responding following PL/IL manipulations, rather than from a deficit in the ability to resolve competing actions per se. Indeed, it was noted that responding to the appetitive cue was altered to a greater extent than to the aversive cue under extinction conditions in PL/IL-inactivated animals. This contrasted with our observation that PL/IL inactivation failed to have any effect on reinforced compound cue responding, presumably as the degree and extent of the effects of PL/IL inactivation on reinforced reward cue and aversive cue responding were equal and opposite. Thus, PL/IL inactivation-induced alterations in responding to the compound cue likely resulted from a composite of inactivation effects on appetitively and aversively motivated cue responding.
PL and IL coordinate nonreinforced discriminative cue responding via downstream NAc targets
Inactivation of the NAc core and shell led to differential changes in adaptive responding to discriminative cues, with inactivation of NAc shell, but not core, impacting reinforced discriminative cue responding. However, NAc core inactivation and shell inactivation (to a lesser extent) selectively diminished nonreinforced cue responding to the appetitive and compound cue, but not the aversive cue. These effects are reminiscent of known dissociations in the role of the NAc core and shell in reward-seeking, with the shell and its dopaminergic innervation posited to mediate reward-seeking and feeding behavior motivated by the unconditioned rewarding effects of drugs and natural reinforcers (Maldonado-Irizarry et al., 1995; Parkinson et al., 1999; Ito et al., 2000, 2004; Carlezon and Thomas, 2009), and the NAc core implicated in mediating the elicitation of instrumental and approach behaviors in response to reward-associated cues (Ito et al., 2000, 2004; Parkinson et al., 2000; Hall et al., 2001; Cardinal et al., 2002; Floresco et al., 2018). The observed lack of effect on aversive cue responding in extinction was unexpected, given that the caudal NAc is implicated in mediating passive conditioned avoidance and innate defensive behaviors (Reynolds and Berridge, 2002; Hamel et al., 2017). However, it is plausible that the striatal substrates underlying cued avoidance of an instrumental response, as studied in the present experiment, are different from those involved in the expression of innate responses or passive avoidance.
Interestingly, despite the fact that inactivation of the NAc core and shell preferentially affected nonreinforced responding to the appetitive cue, chemogenetic inhibition of glutamatergic projections of the PL to NAc core, and to a lesser extent, the IL to shell, nonspecifically increased/disinhibited nonreinforced responding to all cues. This points to mPFC-NAc circuits subserving a critical role in the coordination of adaptive responding when multiple learned cue contingencies are present. Similar disinhibition of responses during epochs that normally require the inhibition of sucrose reward-seeking, such as during nonreward stimulus presentations and/or pre-CS intervals, has been observed most typically with disruption to the IL-NAc shell, but also with manipulation of core-projecting PL (Ishikawa et al., 2008b; Ghazizadeh et al., 2012; Keistler et al., 2015). However, in these studies, responding to the discriminative stimulus signaling reward is unchanged (or elevated nonsignificantly) after circuit manipulation. In the present study, the disinhibition of nonreinforced responding to the appetitive cue was evident, revealing an inability of the animals to inhibit reward-seeking after a change in cue contingency (absence of reward). This is in keeping with the notion that mPFC-NAc circuits subserve a “gate”-like function in suppressing inappropriate or competing responses in the face of changing contingencies of motivationally significant cues.
Finally, the differential pattern of effects we observed with the GABAR-mediated inhibition of mPFC or NAc and chemogenetic inhibition of the mPFC-NAc circuits was somewhat surprising given that an increase in GABAergic tone in the mPFC and direct disruption of activity of glutamatergic projection neurons of the mPFC may be expected to produce largely similar effects. However, the divergent effects may, in part, be explained by the often-overlooked fact that mPFC projection neurons are not exclusively glutamatergic but also GABAergic (Tomioka et al., 2005). Pharmacological agonism of GABAA and GABAB receptors in the mPFC in the present study may therefore have modulated the activity of long-range mPFC GABAergic neurons in addition to inactivating glutamatergic projection neurons. Long-range NAc-projecting and BLA-projecting GABAergic neurons have indeed been identified in the mPFC, and optogenetic stimulation of mPFC GABAergic terminals in the NAc has been shown to induce a state of aversion in a real-time place preference task (Lee et al., 2014). Although the findings of Lee et al. (2014) are not directly comparable to the present findings because of clear procedural differences (optogenetics vs pharmacology), the possibility that NAc-projecting mPFC GABAergic neurons could make distinct and differential contributions to glutamatergic projections in the control of cued adaptive behavior requires further investigation.
In conclusion, we demonstrate that the PL and IL are critical in the discrimination and integration of multiple environmental cues that signal availability of reward and punishment, to enable adaptive responding. Specifically, the PL and IL are recruited to promote instrumental responding to an appetitive cue, and to suppress the same response to a punishment cue under conditions in which the actions are reinforced and nonreinforced. We also report that the PL and IL engage in the coordination of nonreinforced responses to appetitive and aversive cues via downstream NAc targets, core and shell, by suppressing competing and/or inappropriate actions in the face of changing cued response outcome contingencies. These findings provide novel insights into the conditions under which cortico-striatal circuits are recruited to govern cue-evoked adaptive behavior.
Footnotes
This work was supported by the Natural Sciences and Engineering Research Council of Canada Grant 402642 (to R.I.). We thank Dr. Maithe Arruda-Carvalho and members of the DEVsNeuro lab for their assistance with microscopy and Dr. Robert Rozeske for his insightful discussion of our findings.
The authors declare no competing financial interests.
Correspondence should be addressed to Rutsuko Ito at rutsuko.ito{at}utoronto.ca