Abstract
The acquisition of goal-directed action requires encoding of the association between an action and its specific consequences or outcome. At a neural level, this encoding has been hypothesized to involve a prefrontal corticostriatal circuit involving the projection from the prelimbic cortex (PL) to the posterior dorsomedial striatum (pDMS); however, no direct evidence for this claim has been reported. In a series of experiments, we performed functional disconnection of this pathway using targeted lesions of the anterior corpus callosum to disrupt contralateral corticostriatal projections with asymmetrical lesions of the PL and/or pDMS to block plasticity in this circuit in rats. We first demonstrated that unilaterally blocking the PL input to the pDMS prevented the phosphorylation of extracellular signal-regulated kinase/mitogen-activated protein kinase (pERK/pMAPK) induced by instrumental training. Next, we used a full bilateral disconnection of the PL from the pDMS and assessed goal-directed action using an outcome-devaluation test. Importantly, we found evidence that rats maintaining an ipsilateral and/or contralateral connection between the PL and the pDMS were able to acquire goal-directed actions. In contrast, bilateral PL–pDMS disconnection abolished the acquisition of goal-directed actions. Finally, we used a temporary pharmacological disconnection to disrupt PL inputs to the pDMS by infusing the NMDA antagonist dl-2-amino-5-phosphonopentanoic acid into the pDMS during instrumental training and found that this manipulation also disrupted goal-directed learning. These results establish that, in rats, the acquisition of new goal-directed actions depends on a prefrontal–corticostriatal circuit involving a connection between the PL and the pDMS.
SIGNIFICANCE STATEMENT It has been hypothesized that the prelimbic cortex (PL) and posterior dorsomedial striatum (pDMS) in rodents interact in a corticostriatal circuit to mediate goal-directed learning. However, no direct evidence supporting this claim has been reported. Using targeted lesions, we performed functional disconnection of the PL–pDMS pathway to assess its role in goal-directed learning. In the first experiment, we demonstrated that PL input to the pDMS is necessary for instrumental training-induced neuronal activity. Next, we disrupted ipsilateral, contralateral, or bilateral PL–pDMS connections and found that only bilateral PL–pDMS disconnection disrupted the acquisition of goal-directed actions, a finding we replicated in our final study using a pharmacological disconnection procedure.
- corpus callosum
- dorsomedial striatum
- goal-directed action
- instrumental conditioning
- outcome devaluation
- prelimbic cortex
Introduction
The acquisition of goal-directed actions requires the ability to encode the association between the action and its specific consequences or outcome. There is now extensive evidence that this form of learning involves distinct regions of the prefrontal cortex and the striatum (Balleine and O'Doherty, 2010). In the rat, the prelimbic (PL) region of the prefrontal cortex (Balleine and Dickinson, 1998; Corbit and Balleine, 2003; Killcross and Coutureau, 2003; Ostlund and Balleine, 2005; Tran-Tu-Yen et al., 2009) and the posterior dorsomedial striatum (pDMS; Yin et al., 2005a,b) have both been implicated in this acquisition process, suggesting that these structures may form part of a network mediating goal-directed learning.
There is now substantial evidence to suggest that corticostriatal projections arising from the PL to distinct regions of the dorsal striatum play a role in several forms of action selection; for example, Dunnett et al. (2005) demonstrated that bilateral disconnection of the PL from the dorsal striatum disrupted rats' performance in an instrumental delayed alternation task. More recently, Friedman et al. (2015) demonstrated that PL projections to the anterior dorsomedial striatum are necessary for cost–benefit decision-making under conflict. We have recently reported (Hart and Balleine, 2016) that PL neurons that project to the pDMS show heightened expression of phosphorylated mitogen-activated protein kinase/extracellular signal-regulated kinase (pMAPK/pERK), a cellular marker of learning and memory, following acquisition of a new instrumental action. Furthermore, we demonstrated that preventing ERK/MAPK phosphorylation in the PL in a period immediately after instrumental training both prevented goal-directed learning and reduced pERK/pMAPK expression in the pDMS (Hart and Balleine, 2016), suggesting that a direct projection from the PL to pDMS is necessary for this learning process.
In the current study, we more directly assessed the functional involvement of PL inputs to the pDMS in the acquisition of goal-directed actions using corticostriatal disconnection. Due to the bilateral nature of these projections, we adapted an approach developed by Dunnett et al. (2005) to induce partial corpus callosotomies in rats to sever the contralateral corticostriatal projection to the pDMS in a series of disconnection experiments where PL input to the pDMS was made exclusively contralateral or ipsilateral or was bilaterally abolished. We have previously reported that instrumental training induces the phosphorylation of MAPK/ERK in the pDMS (Hart and Balleine, 2016) and so here we initially sought to establish whether this training-induced phosphorylation required PL input by unilaterally disrupting that input to the pDMS, and assessing intrahemispheric differences in pMAPK/pERK expression after instrumental training. Next, we performed ipsilateral, contralateral, or bilateral disconnection of the PL from the pDMS and assessed the role of each of these projections in the acquisition of goal-directed action using a two-operant instrumental conditioning paradigm, and assessed goal-directed control using an outcome-devaluation test (Adams and Dickinson, 1981). We predicted that rats lacking all PL-to-pDMS projections would fail to acquire goal-directed actions. We sought further to determine whether such actions could be acquired in rats with exclusively ipsilateral or exclusively contralateral PL-to-pDMS projections. Finally, we used a temporary pharmacological disconnection to assess whether PL inputs to the pDMS were critical purely during instrumental training for goal-directed learning, and whether this input relied on NMDA receptor activation in the pDMS.
Materials and Methods
The current study was divided into four experiments investigating the role of PL inputs to the pDMS in goal-directed learning. In the first experiment, we used a partial corpus callosotomy to disconnect the contralateral PL–pDMS projections. We used the retrograde tracer Fluoro-Gold to demonstrate that this procedure is effective in eliminating contralateral projections without affecting ipsilateral projections. We then used this disconnection procedure to unilaterally block PL inputs to the pDMS and assessed for expression of pERK/pMAPK in each hemisphere of the pDMS following instrumental or yoked training. In the final two experiments, we used lesions and selective silencing of NMDA receptors via an infusion of dl-2-amino-5-phosphonopentanoic acid (dl-APV) to assess the functional role of ipsilateral, contralateral, and bilateral PL–pDMS projections in the acquisition of goal-directed actions.
Experiment 1: interhemispheric disconnection of PL–pDMS projection
Subjects
Subjects were nine experimentally naive male outbred Long–Evans rats (350–410 g before surgery) obtained from the Monash University Animal Research Platform. For all experiments, rats were housed in opaque plastic boxes (2–4 rats per box) in a climate-controlled colony room and maintained on a 12 h light/dark cycle (lights on at 7:00 A.M.). All experimental stages occurred during the light cycle. Water and standard laboratory chow were continuously available before the start of the experiment. All experimental and surgical procedures were approved by the Animal Ethics Committee at the University of Sydney, and are in accordance with the guidelines set out by the American Psychological Association for the treatment of animals in research.
Surgery
All rats received an infusion of the retrograde tracer Fluoro-Gold unilaterally into the pDMS and an electrolytic lesion or sham lesion of the corpus callosum (CC). For this and all subsequent experiments, rats were anesthetized with 3% inhalant isoflurane gas with oxygen, delivered at a rate of 0.5 L/min throughout surgery. Anesthetized rats were placed in a stereotaxic frame (Kopf). An incision was made down the midline of the skull and the scalp was retracted to expose the skull. Fluoro-Gold (30 mg/ml in saline; Fluorochrome) was infused into the pDMS via a Hamilton 1.0 μl glass syringe, at the following co-ordinates (from bregma): −0.4 mm anteroposterior (AP), ±2.2 mm mediolateral (ML), −3.5 mm dorsoventral (DV). A total volume of 0.5 μl was infused at a rate of 0.25 μl/min. After all surgical infusions, the infusion syringe was left in place for a further 4 min to allow for diffusion.
Fibers crossing the CC were disconnected via an electrolytic lesion because excitotoxic lesions are ineffective for axonal fibers. These lesions were conducted in seven rats using an electrode wire, which was insulated except for 1 mm at the tip. The exposed wire was lowered into the CC (+1.1 mm AP, 0 mm ML, −4.7 mm DV), and current of 7.5–10 V was run through the wire for 20 s. The remaining two rats received sham lesions. This involved lowering the electrode into the CC and then removing it without delivering any electrical current. For this and all experiments, rats were injected with a prophylactic (0.4 ml) dose of 300 mg/kg procaine penicillin immediately after surgery.
Histology and cellular quantification
Two weeks after surgery, rats were injected with a lethal dose of pentobarbital (0.8 ml, i.p.) and perfused transcardially with 400 ml of 4% paraformaldehyde (PFA) in 0.1 M phosphate buffer (PB). Brains were removed and placed in a vial of PFA solution and stored at 4°C for ≤3 d before being sliced into 30 μm horizontal sections on a vibratome, and mounted with Vectashield mounting medium (Vector Laboratories).
Placements of CC lesions and pDMS infusions were verified using a confocal fluorescent microscope (Olympus, BX16WI). Fluoro-Gold was excited with a 405 nm laser and the emission signal was detected within the bandwidth of 430–455 nm. A region of ∼1 mm² in the PL of one dorsal section (3.1 mm ventral to bregma) and one ventral section (4.5 mm ventral) of each brain was imaged using a 10× objective, and retrograde tracing was quantified using ImageJ software (Fiji; Schindelin et al., 2012).
Statistical analyses
Fluoro-Gold tracing was analyzed with planned, orthogonal contrasts to assess differences in labeling across hemispheres and across layers (within-subjects), and within each layer in each hemisphere between lesion groups (anterior CC lesion vs posterior CC lesion and intact controls; between-subjects), controlling the per-contrast error rate at α = 0.05.
Experiment 2: the role of the PL in instrumental training-induced pMAPK/pERK expression in the pDMS
Subjects
Subjects were 18 experimentally naive male outbred Long–Evans rats (330–400 g before surgery) obtained from the same source and kept under the same conditions as those described previously.
Apparatus
For this and all subsequent experiments, training was conducted in 16 MED Associates operant chambers enclosed in sound-attenuating and light-attenuating cabinets. Each chamber was fitted with a pellet dispenser capable of delivering a 45 mg grain food pellet (Bioserve Biotechnologies) to a recessed magazine inside the chamber.
The chambers also contained two retractable levers that could be inserted individually on the left and right sides of the magazine. Head entries into the magazine were detected via an infrared photobeam. Unless otherwise stated, the operant chambers were fully illuminated during all experimental stages and illumination was provided by a 3W, 24 V house light located on the upper edge of the wall opposite the magazine. All training sessions were preprogrammed on two computers located in a separate room through the MED Associates software Med-PC. These computers also recorded the experimental data from each session.
Surgery
All rats received a lesion of the CC exactly as described in Experiment 1, together with a unilateral excitotoxic lesion of the PL made by infusing NMDA (10 mg/ml in 1 M sterilized PBS) into the PL (+3.0 mm AP, ±0.7 mm ML, −4.0 mm DV; volume, 0.3 μl; rate, 0.1 μl/min).
Behavioral protocol and food restriction
Food restriction.
Following recovery from surgery, rats underwent 4 d of food restriction before the onset of the experiment. During this time, they received 5 g of chow daily for the first 2 d, and 10 g from the third day until the end of the experiment. Their weight was monitored daily to ensure it remained >85% of their presurgery body weight.
Magazine training.
Rats were given 2 d of magazine training (Days 1 and 2), which consisted of the delivery of 30 grain pellets into the magazine, one pellet at a time at a random interval averaging 60 s.
Instrumental training.
We trained rats on an interval schedule of reinforcement to minimize variability in session duration due to the temporal sensitivity of pERK expression. This paradigm has been used in our laboratory previously to examine pERK expression in pDMS neurons, and has been demonstrated to favor goal-directed responding (Shan et al., 2014). On Day 3, rats in Group Inst received instrumental training for grain pellets on a continuous reinforcement schedule (CRF) with a single lever, where every lever press earned 1 pellet, with the session terminating once 30 pellets had been acquired, or after 60 min. Rats that timed out at 60 min were given further training sessions until the criterion had been reached. On Day 4, the instrumental schedule was adjusted to random interval 15 s (RI15), in which lever presses were reinforced with a pellet delivery on average every 15 s. This schedule was adjusted again on Days 5 and 6 so that lever presses were reinforced at random intervals averaging 30 s (RI30). A Yoked control group received identical sessions, except pellet delivery was not contingent on instrumental performance; instead, pellets were delivered on schedules where the average interval was matched to that of Group Inst for the same session. Lever side was counterbalanced within groups.
Upon termination of the final training session, the house light remained on and the lever remained out, and each rat was briefly removed from the chamber and given a lethal injection of pentobarbital and returned to the chambers. This was done to minimize neuronal changes that may occur as a result of events signaling the end of the session.
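The random-interval contingency described above can be illustrated with a minimal Python sketch. The implementation details are assumptions, not the Med-PC program used in the study: the function name `simulate_ri_session` is hypothetical, and the sketch assumes that a pellet is "armed" in each second with probability 1/(mean interval) and is then delivered on the next lever press.

```python
import random


def simulate_ri_session(mean_interval_s, press_times_s, seed=0):
    """Return the times of reinforced presses under a random-interval schedule.

    Assumption (not from the source): in each 1 s bin the reward is armed
    with probability 1 / mean_interval_s; once armed, it waits for the
    next press, which is reinforced and disarms the schedule.
    """
    rng = random.Random(seed)
    p_arm = 1.0 / mean_interval_s
    presses = sorted(press_times_s)
    if not presses:
        return []

    rewarded = []
    armed = False
    i = 0
    last_second = int(presses[-1]) + 1
    for second in range(last_second + 1):
        # Arm the schedule at the start of this 1 s bin (if not already armed).
        if not armed and rng.random() < p_arm:
            armed = True
        # Process every press that falls inside this bin.
        while i < len(presses) and presses[i] < second + 1:
            if armed:
                rewarded.append(presses[i])
                armed = False
            i += 1
    return rewarded


if __name__ == "__main__":
    # A hypothetical rat pressing once every 2 s for 20 min on RI30.
    presses = [2.0 * k + 0.5 for k in range(600)]
    earned = simulate_ri_session(30, presses)
    print(f"{len(earned)} pellets earned from {len(presses)} presses")
```

Because reinforcement depends on elapsed time rather than on the number of presses, pellet deliveries stay roughly constant across rats that respond at different rates, which is the property the interval schedule was chosen for here.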
Histology and immunofluorescence
Immediately after the final training session on Day 6, rats were perfused transcardially with 400 ml of 4% PFA in 0.1 M PB. Brains were sliced on a vibratome into 30 μm sections and sections were stored in cryoprotective solution at −20°C.
A subset of sections from each rat were mounted onto slides and stained with cresyl violet to verify the locations and extent of lesions. Lesions were verified under a light microscope, using the boundaries defined by Paxinos and Watson (2014). Rats with misplaced lesions, extensive damage or infection, or incomplete lesions were excluded from the statistical analyses.
Immunofluorescence staining for phosphorylated ERK was conducted on three pDMS sections (−0.2 to −0.4 mm posterior to bregma) per rat. Sections were washed three times in 0.1 M Tris-buffered saline (TBS) with 0.2% NaF for 10 min each. Sections then underwent peroxidase inhibition, which consisted of submerging each section in TBS (NaF) with 1% methanol and 0.3% hydrogen peroxide for 5 min. Sections were again rinsed three times with TBS (NaF), before being submerged in TBS (NaF) with 2% Triton for 30 min to increase membrane permeabilization. Sections were again rinsed three times in TBS (NaF). The primary antibody was rabbit phospho-p44/42 MAPK (Erk1/2; Thr202/Tyr204; Cell Signaling Technology). Each section was submerged in 400 μl of primary antibody diluted in TBS (NaF) at a concentration of 1:300 for 48 h on a shaker at 4°C. Upon removal, sections were again rinsed three times with TBS (NaF) and then placed in secondary goat-anti-rabbit Alexa 488 (Invitrogen) diluted in TBS (NaF) at a concentration of 1:1000 for 2 h at room temperature. Sections were then removed, given two rinses with TBS and two with Tris Buffer (TB), each for 10 min, and mounted with ProLong Gold (Thermo Fisher Scientific). A 1 mm² region was imaged in the pDMS in each hemisphere with a fluorescent confocal microscope using a 20× objective. Cells expressing pMAPK/pERK in each hemisphere were quantified using Fiji (Schindelin et al., 2012) and averaged across three sections for each rat.
Statistical analyses
Behavioral data were analyzed via a two-way (Training Day × Group) ANOVA. Quantification of pMAPK/pERK expression in the pDMS was analyzed via a two-way mixed ANOVA, testing for a within-subjects main effect of hemisphere (ipsilateral vs contralateral), a between-subjects main effect of group (Inst vs Yoked), and their interaction (Hemisphere × Group). Separate orthogonal pairwise comparisons were conducted to assess differences in pMAPK/pERK expression in each hemisphere in each group separately. For all analyses, α was controlled at 0.05.
Experiment 3: disconnection of the prefrontostriatal projection prevents goal-directed learning
Subjects
Subjects were 43 experimentally naive male outbred Long–Evans rats (300–400 g before surgery) obtained from the same source and housed under the same conditions as those previously described.
Apparatus
All training apparatus were the same as those described previously with the addition of a pump fitted with a syringe outside the chamber that delivered 20% sucrose (white sugar; Coles) diluted in H2O to a recessed magazine inside the chamber at a volume of 0.2 ml.
Surgery
Rats in Group CC received an electrolytic CC lesion as well as unilateral sham lesions of the PL and pDMS. Rats in Group CONTRA received a sham CC lesion and contralateral excitotoxic lesions of the PL and pDMS. Rats in Groups IPSI+CC and CONTRA+CC received electrolytic lesions of the CC, as well as unilateral excitotoxic lesions of both the PL and pDMS, either in ipsilateral (Group IPSI+CC) or contralateral (Group CONTRA+CC) hemispheres. Excitotoxic lesions of the PL were conducted as described previously. Lesions of the pDMS were made by infusing NMDA (10 mg/ml in 1 M sterilized PBS) into the pDMS (−0.4 mm AP, ±2.2 mm ML, −4.5 mm DV; volume, 0.6 μl; rate, 0.1 μl/min). These co-ordinates were chosen based on pilot studies indicating that they should produce in these rats lesions that are at their widest at a plane that corresponds to −0.1 mm posterior in the atlas by Paxinos and Watson (2014). For sham lesions, the syringe was lowered into each structure and left in place for several minutes in the absence of any infusion. The location of lesions was counterbalanced across hemispheres within each group.
Behavioral protocol and food restriction
Food restriction.
Following recovery from surgery, rats underwent 4 d of food restriction before the onset of the experiment, as described previously.
Magazine training.
Rats were given 2 d of magazine training (Days 1 and 2), during which they received intermixed deliveries of grain pellets one at a time and 0.2 ml of sucrose solution, delivered separately at random intervals averaging 60 s, and ending once 20 pellets and 20 portions of sucrose had been delivered.
Instrumental training.
The design of this experiment is presented in Figure 3A. Because rats received 11 d of instrumental training, we chose to use a random ratio (RR) schedule of reinforcement, which has been shown to favor goal-directed responding (Dickinson et al., 1983), thereby providing the most robust test for disruption of goal-directed actions. Instrumental training occurred across Days 3–13 on increasing RR schedules of reinforcement with both levers/outcomes. Each lever was extended for 2 × 10 min sessions (i.e., 4 × 10 min sessions in total) separated by 2.5 min of time out, during which the levers were retracted and the house light turned off. Rats were trained on Days 3 and 4 on CRF schedules. On Days 5–7, the schedule was set at RR-5 (i.e., each action delivered an outcome with a probability of 0.2), then increased to RR-10 (or a probability of 0.1) for Days 8–10, then finally to RR-20 (or a probability of 0.05) for Days 11–13. Order of lever presentation was counterbalanced across sessions within groups.
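The ratio schedules above can be summarized in a short Python sketch. The day-to-schedule mapping and the reward probabilities are taken from the text; the names `RATIO_BY_DAY`, `reward_probability`, and `simulate_rr_presses` are hypothetical, and treating CRF as a ratio of 1 is a convenience of this illustration rather than a description of the software used in the study.

```python
import random

# Training day -> ratio requirement, as described in the text.
# CRF (every press rewarded) is represented here as a ratio of 1.
RATIO_BY_DAY = {
    3: 1, 4: 1,            # CRF
    5: 5, 6: 5, 7: 5,      # RR-5
    8: 10, 9: 10, 10: 10,  # RR-10
    11: 20, 12: 20, 13: 20,  # RR-20
}


def reward_probability(ratio):
    """Per-press probability of an outcome under a random-ratio schedule."""
    return 1.0 / ratio


def simulate_rr_presses(ratio, n_presses, seed=0):
    """Count outcomes earned over n_presses, each rewarded independently."""
    rng = random.Random(seed)
    p = reward_probability(ratio)
    return sum(1 for _ in range(n_presses) if rng.random() < p)


if __name__ == "__main__":
    for day in sorted(RATIO_BY_DAY):
        ratio = RATIO_BY_DAY[day]
        print(f"Day {day}: ratio {ratio}, p(reward per press) = "
              f"{reward_probability(ratio):.2f}")
```

Unlike an interval schedule, the number of outcomes earned here scales directly with the number of presses, so the relationship between responding and reward is tighter.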
Outcome-devaluation and choice-extinction tests.
For the outcome-devaluation test, rats were placed in a separate set of chambers and provided with ad libitum access to one of the previously earned outcomes (pellets or sucrose) for 1 h. Rats were then immediately returned to the experimental chambers, where they were given a choice test in extinction with both levers for 10 min. This test was conducted twice (Days 14 and 15); once after devaluation of each outcome. If the rats' lever-press performance is goal-directed and so based on the specific action–outcome associations encoded during training, then their performance of the two actions on tests should reflect the current relative value of their consequences; i.e., they should avoid the action that, during training, delivered the now-devalued outcome and choose the other action.
Reinforced test.
On Day 16, rats were retrained for one session on RR-20. On Day 17, rats were prefed and tested in the manner described for the extinction test, with the exception that the test was 15 min long and lever-press responses resulted in the delivery of outcomes on the previously trained RR-20 schedule. Unlike the devaluation tests, rats were given only one reinforced test because the aim of this test was to establish sensitivity to specific satiety rather than instrumental learning per se; the effect of what is learned in Test 1 will clearly alter performance on any subsequent test, undermining the reliability of any interpretation.
Histology
At the conclusion of the experiment, rats were perfused in the manner described previously. Brains were postfixed for 1 h then placed in 0.1 M PBS with 20% sucrose overnight. Brains were cut into 40 μm coronal sections on a cryostat. Every fourth section was collected on a slide and stained with cresyl violet. Lesion placement was verified under a light microscope using the boundaries defined by Paxinos and Watson (2014). Lesions of the PL were characterized by marked cell loss; pDMS lesions also caused unilateral structural shrinkage and ventricle enlargement.
Statistical analyses
Instrumental pretraining and training data were analyzed using a two-way (Group × training day) ANOVA. Devaluation and reinforced tests were analyzed according to a two-way (Group × Devaluation) mixed ANOVA to compare overall response rates and the magnitude of devaluation in Group CONTRA+CC relative to the other three groups, and between the remaining three groups. Post hoc pairwise tests were conducted using Fisher's least significant difference (LSD) to follow up any significant ANOVAs. Individual pairwise comparisons in additional data were conducted using two-tailed t tests. All tests controlled α at 0.05.
Experiment 4: NMDA modulation of the prefrontostriatal projection during instrumental training mediates goal-directed learning
Subjects
Subjects were 78 experimentally naive male outbred Long–Evans rats (328–480 g before surgery) obtained from the same source and housed under the same conditions as those previously described.
Apparatus
All training apparatus were the same as those described previously with the addition of a pump fitted with a syringe outside the chamber that delivered 20% maltodextrin (Poly-Joule, Nutrica) diluted in H2O to a recessed magazine inside the chamber at a volume of 0.2 ml. The magazine and pump that delivered the maltodextrin solution were separate from the apparatus delivering the sucrose solution, so there was little risk of flavor contamination.
Surgery and infusions
All rats received a unilateral lesion of the PL, a lesion of the CC, and ipsilateral or contralateral implantation of a cannula into the pDMS. The surgical details regarding anesthesia and PL and CC lesions were the same as those described previously. For the cannulation, three jeweler's screws were inserted into the skull distributed laterally and rostrally from where the pDMS cannula would be inserted. A 6 mm guide cannula (Plastics One) was inserted into the pDMS at the following co-ordinates: −0.4 mm AP, ±2.2 mm ML, −3.5 mm DV. The guide cannula was held in place by dental cement, which was applied to the skull around the insertion point of the cannula, and over the jeweler's screws. A dummy cannula remained inside the guide cannula at all times except during infusions. The lesion and cannulation sides were counterbalanced across hemispheres within each group.
During infusions, infusion cannulae (6 + 1 mm projection; Plastics One) were inserted into the guide cannula and connected to a Harvard pump (Harvard Apparatus) with a Hamilton syringe (Hamilton) attached. Drug or vehicle (saline) was infused unilaterally at a rate of 0.15 μl/min for a total infusion volume of 0.5 μl. The infusion cannula was left for an additional minute to allow for diffusion.
Drug
dl-APV (Tocris Bioscience) was dissolved in 0.9% w/v isotonic saline to a concentration of 1 mg/ml. The vehicle solution was 0.9% w/v isotonic saline.
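As a quick arithmetic check, the infusion volumes, rates, and dl-APV concentration reported in the surgery and drug sections imply the pump run times and delivered mass computed below. This is a minimal sketch for the reader's convenience; the function names `infusion_minutes` and `delivered_ug` are hypothetical, and the only inputs are values stated in the text.

```python
def infusion_minutes(volume_ul, rate_ul_per_min):
    """Time the pump runs to deliver volume_ul at rate_ul_per_min."""
    return volume_ul / rate_ul_per_min


def delivered_ug(conc_mg_per_ml, volume_ul):
    """Mass delivered in micrograms (1 mg/ml is the same as 1 ug/ul)."""
    return conc_mg_per_ml * volume_ul


# (volume in ul, rate in ul/min), as reported in the Methods.
INFUSIONS = {
    "Fluoro-Gold into pDMS (Experiment 1)": (0.5, 0.25),
    "NMDA into PL": (0.3, 0.1),
    "NMDA into pDMS (Experiment 3)": (0.6, 0.1),
    "dl-APV or vehicle into pDMS (Experiment 4)": (0.5, 0.15),
}

if __name__ == "__main__":
    for label, (volume, rate) in INFUSIONS.items():
        print(f"{label}: {infusion_minutes(volume, rate):.1f} min")
    print(f"dl-APV per infusion: {delivered_ug(1.0, 0.5):.2f} ug")
```

On these figures, each dl-APV infusion runs for a little over 3 min and delivers about 0.5 μg of drug, before the additional minute allowed for diffusion.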
Behavioral protocol
Recovery and food-restriction procedures were the same as described previously.
Magazine training.
Rats were given 2 d of magazine training (Days 1 and 2); each day they received 30 deliveries of 20% maltodextrin solution into the magazine. These were given one at a time at random intervals averaging 60 s.
Instrumental pretraining.
The design of this experiment is presented in Figure 4A. All rats were pretrained to make two instrumental responses across 4 d, and then given 2 critical days of instrumental training in which they learned two new action–outcome associations. Because animals received only 2 d of critical training, we chose to use interval schedules of reinforcement to minimize variance between rats in response rate and outcome delivery. We have demonstrated several times (Hart and Balleine, 2016) that this training paradigm produces goal-directed responding in intact rats. All instrumental pretraining sessions terminated once 30 outcomes had been earned, or after 60 min. On the morning of Day 3, all rats received instrumental training for maltodextrin solution on a CRF schedule with a single lever. A second identical session was conducted in the afternoon of the same day with the alternative lever. On Day 4, rats again received two separate training sessions with maltodextrin on a RI15 schedule (responses are rewarded approximately once every 15 s). On Days 5 and 6, rats received identical training on RI30 (average interval between rewarded responses is 30 s) schedules. The order of lever presentations was counterbalanced within groups across days.
Instrumental training.
On Days 7 and 8, rats received two instrumental training sessions (one per day) with two new outcomes (grain pellets and sucrose solution). Each lever/outcome was trained individually within the same session on an RI30 schedule, where each lever was extended for 2 × 10 min sessions in alternating fashion (i.e., 4 × 10 min sessions in total) separated by a 1 min time out, during which the levers were retracted and the house light turned off. Immediately before each of these training days, rats were infused with either dl-APV or vehicle unilaterally into the pDMS. The order of lever and outcome presentations was fully counterbalanced within each group. Rats were given a small sample of each of the outcomes on Day 6 to reduce neophobia.
Outcome-devaluation and choice-extinction test.
Outcome-devaluation and extinction tests were conducted over the next 2 d in the manner described previously, except that the extinction tests were 5 min long.
Histology
Rats were perfused as described previously, and brains were postfixed and placed in 0.1 M PBS with 20% sucrose overnight. Brains were cut into 40 μm coronal sections on a cryostat. Every fourth section was collected on a slide and stained with cresyl violet. Lesion and cannula locations were verified under a light microscope using the boundaries defined by Paxinos and Watson (2014).
Statistical analyses
Instrumental training data were analyzed using a two-way (Group × Training Day) ANOVA. Devaluation was analyzed according to a two-way (Group × Devaluation) mixed ANOVA to compare overall response rates and the magnitude of devaluation in Group APV-CONTRA relative to the other three groups, and between the remaining three groups. Post hoc pairwise tests were conducted using Fisher's LSD to follow up any significant ANOVAs. Individual pairwise comparisons in additional data were conducted using two-tailed t tests. All tests controlled α at 0.05.
Results
Experiment 1: interhemispheric disconnection of PL–pDMS projections
We sought to sever axonal fibers in the interhemispheric pathway by administering a targeted electrolytic lesion to the CC. We injected Fluoro-Gold into the pDMS and quantified ipsilateral and contralateral retrogradely labeled neurons in the PL of rats with and without an electrolytic CC lesion.
Histology
All of the administered CC lesions included the most ventral portion. However, a subset (two) were confined to the posterior CC, leaving the anterior CC intact, whereas the remainder (five) were predominantly located in the anterior CC, leaving some of the posterior CC intact. Figure 1 shows a representative section from a control (intact; Fig. 1A) and an anterior CC lesion (Fig. 1B).
Figure 1. Interhemispheric disconnection of PL–pDMS projections. A, B, Horizontal 30 μm sections (−3.7 to −4.1 mm ventral to bregma) showing Fluoro-Gold injection in the pDMS in a rat with no CC lesion (A) and with an anterior CC lesion (B). Scale bars, 2000 μm. C, D, Retrograde Fluoro-Gold labeling in the PL ipsilateral (Ipsi; left) and contralateral (Contra; right) to the Fluoro-Gold injection in the intact section shown in A (C) and in the section with an anterior CC lesion shown in B (D). Scale bars, 100 μm. E, Mean Fluoro-Gold-labeled cells in Layer 2/3 and Layers 5–6 of the ipsilateral and contralateral PL in rats with an anterior CC lesion, posterior CC lesion, and no lesion (Intact). Error bars represent SEM, *p < 0.05.
Fluoro-Gold labeling
Retrograde labeling of the ipsilateral (ipsi) and contralateral (contra) PL in an intact brain and a brain with an anterior CC lesion is shown in Figure 1C,D. Figure 1E shows the mean number of Fluoro-Gold-positive cells in Layer 2/3 and Layers 5–6 of the ipsilateral and contralateral PL in rats with intact brains (Intact) or an anterior or posterior CC lesion. Consistent with existing literature (Gabbott et al., 2005), most retrogradely labeled neurons were located in Layers 5–6 with predominance in Layer 5; however, there was also a distinct, less dense band of neurons located predominantly in Layer 2/3 in the ipsilateral hemisphere. This superficial band was absent in the contralateral hemisphere, with the vast majority of neurons arising from Layer 5. Across all groups, there were significantly more labeled neurons in the ipsilateral than in the contralateral hemisphere (F(1,6) = 28.6, p = 0.002), and there were more labeled neurons in deeper layers than in superficial layers (F(1,6) = 89.6, p < 0.001). The CC lesion did not produce any disruption to the ipsilateral pathway; there were no differences between groups in the number of ipsilateral neurons labeled in any layer (F's < 1.0). However, the anterior CC lesion almost completely abolished contralateral PL labeling relative to rats with a posterior CC lesion and intact controls (F(1,6) = 17.8, p = 0.006), indicating that lesions that incorporate the most anterior part of the CC disrupted the interhemispheric PL–pDMS projection, consistent with reports by Dunnett et al. (2005).
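The p values reported above can be checked against their F statistics with a short, dependency-free Python sketch. This is a reader's consistency check under stated assumptions, not the authors' analysis code: it relies on the identity that an F statistic with 1 numerator degree of freedom equals the square of a t statistic with the same error degrees of freedom, and it integrates the t density numerically with Simpson's rule. The function names `t_pdf` and `p_from_f_1df` are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def p_from_f_1df(f_value, df_error, steps=20000):
    """p value for F(1, df_error), using F = t**2 and Simpson's rule.

    steps must be even. The integral gives P(0 < T < sqrt(F)); the
    two-tailed p value is 1 minus twice that area.
    """
    upper = math.sqrt(f_value)
    h = upper / steps
    total = t_pdf(0.0, df_error) + t_pdf(upper, df_error)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df_error)
    area = total * h / 3
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    # F statistics reported in the text, all with (1, 6) degrees of freedom.
    for label, f_value in [
        ("hemisphere", 28.6),
        ("layer", 89.6),
        ("anterior CC lesion, contralateral labeling", 17.8),
    ]:
        print(f"{label}: F(1,6) = {f_value}, p = {p_from_f_1df(f_value, 6):.4f}")
```

The computed values agree with the reported p = 0.002, p < 0.001, and p = 0.006 once rounded to three decimal places.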
Experiment 2: the role of the PL in instrumental training-induced pMAPK/pERK expression in the pDMS
Having established that we could disrupt contralateral PL input to the pDMS, we sought to assess the role of this input in the phosphorylation of MAPK/ERK following instrumental training. We used a design that allowed us to compare, within subjects, pMAPK/pERK expression in the pDMS of one hemisphere that received PL input with that in the contralateral hemisphere, which had been disconnected from the PL, in rats that received instrumental or yoked training. All rats received a unilateral lesion of the PL and a CC lesion. Hence, the pDMS in one hemisphere had no effective PL input, whereas the contralateral hemisphere retained ipsilateral PL input.
Histology
The criteria for inclusion of PL lesions in this and subsequent experiments were that the lesions must extend at least from 4.2 to 3.2 mm anterior to bregma, and they must not substantially extend into any neighboring structures. Based on these criteria, five rats from Group Inst and three rats from Group Yoked were excluded from the analysis due to misplaced lesions. Placements of PL and CC lesions for the remaining rats are presented in Figure 2A,B.
The role of the PL in instrumental training-induced pMAPK/pERK expression in the pDMS. A, B, Distribution and locations of the PL (A) and CC (B) lesions included in the analysis. C, D, Mean rate of magazine entries (C) and lever presses (D) per minute averaged across each day of instrumental/yoked training for each group. E, Representative images from the pDMS (coronal 30 μm section, −0.1 to −0.3 mm posterior to bregma), showing pERK expression in the ipsilateral (IPSI, left) and contralateral (CONTRA, right) hemispheres to the PL lesion in one rat from Group Inst (Inst, top) and one rat from Group Yoked (Yoked, bottom). Scale bar, 100 μm. F, Mean number of cells per section expressing pERK in the ipsilateral (IPSI) and contralateral (CONTRA) hemisphere to the PL lesion in Group Inst and Group Yoked. Bars represent group means. Error bars represent SEM, *p < 0.05.
Behavior
Figure 2C,D shows the rates of magazine entry (Fig. 2C) and lever pressing (Fig. 2D) across the 4 d of instrumental/yoked training. Across the 4 d of instrumental training, all rats showed substantial and stable rates of magazine entry, and there were no significant differences between groups (F's < 1.0). Rats in Group Inst (n = 4) showed a gradual increase across days in the average rate of lever pressing per minute. In contrast, rats in Group Yoked (n = 6) failed to show any such increase, and displayed little to no lever pressing during the 4 d of training. This was confirmed by a significant main effect of Group (F(1,8) = 31.49, p = 0.001) and a significant Group × Day (linear) interaction (F(1,8) = 56.9, p < 0.001).
Immunofluorescence
Figure 2E shows the extent of pMAPK/pERK expression in the pDMS in the hemisphere ipsilateral and contralateral to the PL lesion in a rat from Group Inst and a rat from Group Yoked. The average number of pERK cells across three pDMS sections was quantified for each rat in these groups. Mean pERK cells in the ipsilateral and contralateral pDMS to the PL lesion for each group are presented in Figure 2F. Group Inst did not differ significantly from Group Yoked in the total number of pERK cells, indicated by a nonsignificant main effect of Group (F(1,8) = 1.6). There was a significant main effect of side (ipsilateral vs contralateral; F(1,8) = 10.78, p = 0.01), and a significant Side × Group interaction (F(1,8) = 5.8, p = 0.04), indicating that the levels of pERK in the ipsilateral versus the contralateral hemisphere in the rats in Group Inst differed significantly from Group Yoked. Simple effects analysis revealed that this difference was driven by the significant difference in pERK expression on the ipsilateral versus contralateral side for Group Inst (F(1,8) = 13.4, p = 0.006). There was no significant difference in pERK expression in Group Yoked (F < 1.0).
Experiment 3: disconnection of the prefrontostriatal projection prevents goal-directed learning
We next sought to investigate the functional role of ipsilateral and contralateral PL–pDMS projections in the acquisition of goal-directed actions. To achieve this, we used a disconnection procedure in which the CC lesion was coupled with a unilateral lesion of the PL in one hemisphere and a unilateral lesion of the pDMS in the contralateral hemisphere (Group CONTRA+CC). Three additional groups were as follows: (1) Group CC: given a CC lesion with sham lesions of both the PL and pDMS; (2) Group IPSI+CC: given a CC lesion plus unilateral lesions of both the PL and pDMS in the same (i.e., the ipsilateral) hemisphere, leaving the contralateral hemisphere intact; (3) Group CONTRA: given the contralateral lesions of the PL and pDMS but with no CC lesion. Rats therefore had either one or both ipsilateral pathways intact (Group CC and Group IPSI+CC), one contralateral pathway intact (Group CONTRA), or no pathways intact (Group CONTRA+CC). A summary of the circuitry preserved in each group is presented in Figure 3B.
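The circuit logic behind these four groups can be sketched in a few lines of Python. This is an illustrative model only and was not part of the analysis: the function name `intact_pathways`, the hemisphere labels, and the choice of the left hemisphere for the PL lesion are assumptions made for the example. It treats a PL-to-pDMS pathway as intact when its source PL and target pDMS are both unlesioned and, for crossed pathways, when the CC is uncut.

```python
from itertools import product

HEMISPHERES = ("left", "right")


def intact_pathways(pl_lesions, pdms_lesions, cc_cut):
    """Return surviving PL -> pDMS pathways as (PL side, pDMS side) pairs."""
    surviving = set()
    for pl_side, pdms_side in product(HEMISPHERES, repeat=2):
        if pl_side in pl_lesions or pdms_side in pdms_lesions:
            continue  # the source PL or the target pDMS has been lesioned
        if pl_side != pdms_side and cc_cut:
            continue  # crossed projections are assumed to travel via the CC
        surviving.add((pl_side, pdms_side))
    return surviving


# Lesion sides are arbitrary (hypothetical choice: PL lesion on the left).
GROUPS = {
    "CC":        dict(pl_lesions=set(),    pdms_lesions=set(),     cc_cut=True),
    "IPSI+CC":   dict(pl_lesions={"left"}, pdms_lesions={"left"},  cc_cut=True),
    "CONTRA":    dict(pl_lesions={"left"}, pdms_lesions={"right"}, cc_cut=False),
    "CONTRA+CC": dict(pl_lesions={"left"}, pdms_lesions={"right"}, cc_cut=True),
}

for name, lesions in GROUPS.items():
    print(name, sorted(intact_pathways(**lesions)))
```

Under these assumptions, Group CC retains both ipsilateral pathways, Group IPSI+CC retains one ipsilateral pathway, Group CONTRA retains one crossed pathway, and Group CONTRA+CC retains none.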
Disconnection of the prefrontostriatal projection prevents goal-directed learning. A, Summary of the experimental design. B, Summary of lesions in each group. C, Cresyl violet-stained sections (40 μm) showing a lesion of the PL (left), characterized by marked cell loss as outlined in left hemisphere; a CC lesion, indicated by tissue loss (middle); and a pDMS lesion, indicated by ventricle enlargement and structural shrinkage (right, right hemisphere). D, Distribution and location of the PL (left), CC (middle), and pDMS (right) lesions for all rats included in the analysis. E, Mean presses per minute averaged across each day of instrumental training for each group. F, Mean presses per minute on the valued and devalued lever averaged across the two tests for outcome devaluation under extinction. G, Mean presses per minute on the valued lever (gray bars) and devalued lever (red bars) during the first 2 min of Test 1 (left) and the last 2 min of Test 2 (right) of the extinction tests for devaluation. H, Mean presses per minute on the valued and devalued lever during the reinforced test for outcome devaluation. Error bars represent SEM, *p < 0.05.
Histology
Photomicrographs showing representative PL, CC, and pDMS lesions are presented in Figure 3C. Placements of the CC, PL, and pDMS lesions for all rats are presented in Figure 3D. As well as the criteria for PL lesions, rats were excluded if CC lesions did not include the most anterior portion, and pDMS lesions minimally had to include the area between 0.12 mm anterior and 0.36 mm posterior to bregma, without extending into the anterior DMS, the boundary of which we set at 0.48 mm anterior (Hart and Balleine, 2017), or into the dorsolateral striatum. A total of 26 rats were excluded from the experiment due to misplaced, incomplete, or too extensive lesions. Of these, 12 were excluded on the basis of misplaced or incomplete CC lesions, six were excluded due to misplaced or incomplete PL lesions, and one was excluded due to an incomplete pDMS lesion. The remaining seven were excluded due to two or three misplacements. After exclusions, 30 animals remained in the analysis (Group CC, n = 8; Group IPSI+CC, n = 8; Group CONTRA, n = 7; Group CONTRA+CC, n = 7).
Behavior
Instrumental training
The design of this experiment is summarized in Figure 3A. Rats were food-deprived and trained to press two freely available levers (R1 and R2) to earn two distinct food rewards: grain pellets and sucrose solution (O1 and O2). Figure 3E shows the rate of responding for each group across consecutive training days. All groups acquired lever-press responding for the two outcomes, and there were no significant differences between groups in their overall rates of responding, or rates of acquisition; the highest between-group difference was Groups CC and CONTRA (averaged) versus Groups IPSI+CC and CONTRA+CC (averaged; F(1,26) = 2.9, p = 0.1).
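For contrasts with a single numerator degree of freedom, such as the one reported above, F is the square of a t statistic, so the p value follows from the two-tailed t distribution with the error degrees of freedom. The following is a minimal sketch of that relationship using only the Python standard library. The function names `t_pdf` and `p_from_single_df_f` are hypothetical, and the numerical integration (Simpson's rule) was chosen to keep the example self-contained; it is not a description of the software used for the reported analyses.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def p_from_single_df_f(f_value, df_error, steps=2000):
    """Two-tailed p value for F(1, df_error), using F = t**2.

    Integrates the t density from 0 to sqrt(F) with Simpson's rule
    (steps must be even) and subtracts the central area from 1.
    """
    t = math.sqrt(f_value)
    h = t / steps
    total = t_pdf(0.0, df_error) + t_pdf(t, df_error)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_error)
    central_area = 2 * (h / 3) * total  # area between -t and +t
    return 1.0 - central_area


# Between-group contrast on training response rates: F(1,26) = 2.9
print(round(p_from_single_df_f(2.9, 26), 2))
```

For the contrast above, this should return a value close to the reported p = 0.1.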
Devaluation test
To assess whether the instrumental responses were goal-directed, rats were tested using an outcome-devaluation procedure in which they were prefed on one or the other of the outcomes earned in training before being given a choice test with the two levers in extinction. If goal-directed learning depends on having at least one intact corticostriatal circuit, then we anticipated observing an outcome-devaluation effect (i.e., reduced responding on the lever associated with the prefed outcome) in each of the control groups, since either the ipsilateral or contralateral prefrontostriatal projection was intact in each group. In contrast, we predicted that goal-directed learning would be abolished in rats in Group CONTRA+CC because neither the ipsilateral nor the contralateral connections were intact in this group, and that they would show no evidence of an outcome-devaluation effect as a consequence.
Figure 3F shows the mean rate of responding on the two levers during the choice-devaluation test. There were no differences in overall rates of responding between the three control groups (pairwise comparisons, highest F(1,26) = 2.06, p = 0.16, Group CC vs IPSI+CC). Likewise, despite a clear depression in responding in Group CONTRA+CC, this difference was not statistically significant when compared with each of the three control groups (CONTRA+CC vs CC F(1,26) = 3.8, p = 0.06; CONTRA+CC vs IPSI+CC F(1,26) = 0.33, p = 0.6; CONTRA+CC vs CONTRA F(1,26) = 2.05, p = 0.16). There was, however, a clear difference in the way responses were distributed between groups; rats with bilateral PL input to the pDMS (Group CC) or with unilateral PL input to the pDMS—either contralaterally (Group CONTRA) or ipsilaterally (Group IPSI+CC)—showed a clear outcome-devaluation effect, whereas rats with no PL input to the pDMS (Group CONTRA+CC) failed to show outcome devaluation. This observation was confirmed statistically by a significant Group (CONTRA+CC vs the controls) × Devaluation interaction (F(1,26) = 6.4, p = 0.02). There were no significant differences between the remaining three groups in the magnitude of devaluation (highest F(1,26) = 1.8, p = 0.2, Group CC vs CONTRA and IPSI+CC). Follow-up simple effects analysis (Fisher LSD) further showed that there was a significant devaluation effect for Group CC (p < 0.001), Group IPSI+CC (p = 0.0048), and Group CONTRA (p = 0.0046), but not for Group CONTRA+CC (p = 0.55; Fig. 3F).
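The Group × Devaluation interaction contrast can be illustrated with a short Python sketch. Because devaluation has only two levels, a between-group contrast on the interaction is equivalent to the same contrast applied to each rat's difference score (valued minus devalued), tested against the pooled within-group variance of those scores. The function name `interaction_contrast`, the contrast weights, and all data values below are hypothetical and chosen only to match the group sizes; they are not the study's data, and the sketch assumes a pooled error term.

```python
from statistics import mean


def interaction_contrast(diff_scores, weights):
    """F test for a between-group contrast on within-subject difference scores.

    diff_scores: dict of group name -> list of (valued - devalued) scores.
    weights: dict of group name -> contrast weight (weights sum to zero).
    Returns (F, df_error), with one numerator degree of freedom.
    """
    assert abs(sum(weights.values())) < 1e-12
    group_means = {g: mean(v) for g, v in diff_scores.items()}
    # Pooled within-group sum of squares of the difference scores.
    ss_error = sum((x - group_means[g]) ** 2
                   for g, v in diff_scores.items() for x in v)
    df_error = sum(len(v) for v in diff_scores.values()) - len(diff_scores)
    ms_error = ss_error / df_error
    estimate = sum(weights[g] * group_means[g] for g in diff_scores)
    scale = sum(weights[g] ** 2 / len(diff_scores[g]) for g in diff_scores)
    return estimate ** 2 / (ms_error * scale), df_error


# Hypothetical difference scores (presses/min); NOT the study's data.
scores = {
    "CC":        [4.0, 5.5, 3.0, 6.0, 2.5, 4.5, 5.0, 3.5],
    "IPSI+CC":   [2.0, 3.5, 1.5, 4.0, 2.5, 3.0, 1.0, 2.5],
    "CONTRA":    [3.0, 4.5, 2.0, 5.0, 3.5, 2.5, 4.0],
    "CONTRA+CC": [0.5, -1.0, 1.0, 0.0, -0.5, 1.5, 0.5],
}
# Controls versus the disconnected group.
weights = {"CC": 1, "IPSI+CC": 1, "CONTRA": 1, "CONTRA+CC": -3}

f_value, df_error = interaction_contrast(scores, weights)
print(f"F(1,{df_error}) = {f_value:.2f}")
```

With 30 rats in four groups, the error term has 30 − 4 = 26 degrees of freedom, which matches the F(1,26) values reported here.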
To assess potential changes in the rate of extinction between groups, we additionally looked at the differences in responding across the first 2 min of the first extinction test and the last 2 min of the second test. The rate of responding on the first 2 min (left) and last 2 min (right) on the valued and devalued levers are shown in Figure 3G. It is clear from these data that rats in Group CONTRA+CC responded less on the valued lever from the start of the test, although this difference was not statistically significant (all F's < 2.3), and all groups showed clear extinction from the start of Test 1 to the end of Test 2 (main effect of time F(1,26) = 16.4, p < 0.001). There was, however, no interaction between the magnitude of extinction in Group CONTRA+CC and the other three groups (all F's(1,26) < 1.0). Finally, to determine whether rats in Group CONTRA+CC showed devaluation at the very start of the session, we assessed responding on the valued and devalued levers just for that group in the first 2 min of the test and confirmed that the failure to show devaluation was present from the start of the test (two-tailed t test t(6) = 0.55, p = 0.6).
Because rats in Group CONTRA+CC showed lower (although not statistically different) rates of responding during the extinction test, we additionally assessed performance in only the rats with the highest response rates from this group; of the seven rats in Group CONTRA+CC, four showed response rates >2 presses per minute on average. The mean rate of responding by these four rats was greater than that of the IPSI+CC control group (means: CONTRA+CC, 4.78; CC, 7.11; IPSI+CC, 4.45; CONTRA, 6.19). However, unlike the rats in the three control groups, these rats still failed to show outcome devaluation (mean response rate: valued, 5.4; devalued, 4.16; two-tailed t test t(3) = 1.61, p = 0.21, data not shown), indicating that the effect of corticostriatal disconnection on outcome devaluation is not due to a floor effect induced by a general reduction in response rate.
As noted, many rats were excluded from the analysis. We therefore examined responding during the devaluation test for rats in Group CONTRA+CC that were excluded from the primary analysis on the basis of a single misplaced or incomplete lesion. Five rats met this criterion: one rat with an incomplete CC lesion, three rats with incomplete PL lesions, and one rat with an incomplete pDMS lesion. All rats responded more on the valued than on the devalued lever during testing. As a group, these rats showed a significant devaluation effect (two-tailed t test, t(4) = 12.16, p < 0.01), providing further anatomical confirmation that all three lesions are necessary to produce the deficit observed in Group CONTRA+CC.
Reinforced test
To assess whether differences in responding during the devaluation test reflected differences in goal-directed action or were due either to differences in sensitivity to the prefeeding or difficulty in discriminating between the two outcomes or levers, rats were given a follow-up test. This test was procedurally identical to the devaluation test except that instrumental responses now delivered their respective outcomes. As shown in Figure 3H, when rats were given feedback on their instrumental choices, all groups showed a robust preference for the valued lever (main effect of devaluation: F(1,26) = 23.6, p < 0.001) and there were no differences between groups in the magnitude of this preference or in the overall rates of responding (highest F = 3.0, p = 0.09, Groups CC and CONTRA vs Groups IPSI+CC and CONTRA+CC). This test confirmed, therefore, that rats with PL–pDMS disconnection were sensitive to the devaluation manipulation and were capable of discriminating among the outcomes and levers.
Experiment 4: NMDA modulation of the prefrontostriatal projection during instrumental training mediates goal-directed learning
Having shown that the PL input to the pDMS is required to acquire goal-directed actions, we next sought to assess when this input is required, and whether this input relies on NMDA receptor activation in the pDMS. It has been demonstrated that the PL is critical for the acquisition, but not expression, of goal-directed actions (Ostlund and Balleine, 2005), and that NMDA transmission in the pDMS is required during training for the development of goal-directed actions (Yin et al., 2005a). We hypothesized, therefore, that the activation of NMDA receptors on the PL–pDMS projection is necessary during instrumental training for the acquisition of goal-directed actions.
Histology
Four groups of rats received a unilateral PL lesion and CC lesion and had a cannula implanted into the pDMS either ipsilaterally or contralaterally to the PL lesion (Groups IPSI and CONTRA, respectively). A summary of the circuitry preserved in each group is presented in Figure 4B. The placements for the PL lesions, CC lesions, and pDMS cannulae are presented in Figure 4C. Forty-six rats were excluded from the primary analysis due to misplaced, incomplete, or too extensive lesion-induced damage; 15 were excluded on the basis of misplaced or incomplete CC lesions, six were excluded due to misplaced or incomplete PL lesions, and three were excluded due to misplaced pDMS cannulae. The remaining 22 were excluded due to two or three misplacements. After exclusions, 32 animals remained in the experiment (Group VEH-IPSI, n = 6; Group VEH-CONTRA, n = 8; Group APV-IPSI, n = 10; Group APV-CONTRA, n = 8).
NMDA modulation of the prefrontostriatal projection during instrumental training mediates goal-directed learning. A, Summary of the experimental design. B, Summary of lesions and infusions in each group. C, Distribution and location of the PL (left), CC (middle), and pDMS (right) lesions or cannulae for all rats included in the analysis. D, Mean presses per minute averaged across each day of instrumental training for each group. E, Mean presses per minute on the valued and devalued lever averaged across the two tests for outcome devaluation under extinction. Error bars represent SEM, *p < 0.05.
Behavior
Instrumental training
The experimental design is summarized in Figure 4A. To establish a robust baseline rate of responding, rats first received instrumental pretraining on the two actions for a common outcome (i.e., maltodextrin; O3). Rats then received 2 d of instrumental training, during which each lever-press action was paired with a unique outcome (pellets and sucrose; O1 and O2). Before each of these days of training, rats in Groups IPSI and CONTRA received either an infusion of the NMDA antagonist dl-APV (500 ng in 0.5 μl) or vehicle (0.5 μl) into the pDMS, generating four groups: VEH-IPSI, APV-IPSI, VEH-CONTRA, and APV-CONTRA. As a consequence, in the APV groups, NMDA transmission in the pDMS was disrupted unilaterally, either ipsilaterally (Group APV-IPSI) or contralaterally (Group APV-CONTRA) to the PL lesion. Figure 4D shows the mean rate of responding across the 2 d of instrumental training. Despite the drug treatment, there were no differences in performance between groups on overall rates of responding or rate of acquisition across days (F's < 2.2).
Devaluation test
To assess whether NMDA receptors on the PL input to pDMS were necessary for the acquisition of goal-directed actions, rats were assessed (drug free) in an outcome-devaluation test. Test data are presented in Figure 4E; although all rats had one ipsilateral connection between the PL and pDMS intact during the outcome-devaluation test, rats in Group APV-CONTRA failed to show a significant devaluation effect, relative to the other three groups (Group × Devaluation interaction F(1,28) = 4.5, p = 0.04). Groups did not differ significantly in their overall response rates (APV-CONTRA vs controls; F(1,28) = 3.0, p = 0.09). It was apparent that Group APV-IPSI had lower response rates and showed a smaller devaluation effect than rats in the VEH groups; however, this was not significant (main effect of Group APV-IPSI vs VEH; F(1,28) = 3.1, p = 0.09; Group × Devaluation interaction APV-IPSI vs VEH; F(1,28) = 2.3, p = 0.1), and there was no significant difference between the two VEH groups in magnitude of devaluation or overall response rate (F's < 1.0). Follow-up simple effects analysis (Fisher's LSD) further showed that there was a significant devaluation effect for Group VEH-IPSI (p = 0.003), APV-IPSI (p = 0.03), and VEH-CONTRA (p = 0.0005), but not for Group APV-CONTRA (p = 0.4).
We looked at responding during the devaluation test for rats in Group APV-CONTRA that were excluded from the primary analysis on the basis of a single misplaced or incomplete lesion. There were five rats that met this criterion: two rats with incomplete CC lesions, two rats with incomplete PL lesions, and one rat with a misplaced pDMS cannula. All rats responded more on the valued than the devalued lever during testing and, as a group, these rats showed a significant devaluation effect (two-tailed t test, t(4) = 3.75, p = 0.01). Together, these results suggest that NMDA-related activity in the PL–pDMS pathway is necessary for goal-directed learning.
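The distinction between training and test in this design can be made explicit with a small Python sketch. It is an illustrative simplification rather than part of the analysis: it assumes that the unilateral PL lesion plus CC lesion leaves a single PL-to-pDMS pathway in the spared hemisphere, and that an APV infusion into that hemisphere's pDMS (i.e., contralateral to the PL lesion) renders the pathway nonfunctional for learning while the drug is present. The function name `spared_pathway_functional` and the group encoding are hypothetical.

```python
def spared_pathway_functional(apv_side, drug_on_board):
    """Whether the one remaining PL -> pDMS pathway can support learning.

    apv_side: 'ipsi' or 'contra' relative to the PL lesion; None for vehicle.
    drug_on_board: True during training infusions, False at the drug-free test.
    The spared pathway lies in the hemisphere contralateral to the PL lesion,
    so only a 'contra' APV infusion is assumed to reach it.
    """
    blocked = drug_on_board and apv_side == "contra"
    return not blocked


# Infusion received by each group (None = vehicle).
GROUPS = {
    "VEH-IPSI": None,
    "APV-IPSI": "ipsi",
    "VEH-CONTRA": None,
    "APV-CONTRA": "contra",
}

for group, apv_side in GROUPS.items():
    training = spared_pathway_functional(apv_side, drug_on_board=True)
    test = spared_pathway_functional(apv_side, drug_on_board=False)
    print(f"{group}: training={training}, test={test}")
```

Under these assumptions, only Group APV-CONTRA lacks a functional pathway during training, whereas every group has one at test, which is the pattern the devaluation results are interpreted against.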
Discussion
The current experiments provide the first direct evidence that the PL–pDMS pathway is necessary for goal-directed learning. Initially, we used a retrograde tracer to demonstrate that contralateral projections from the PL to the pDMS were disrupted by a partial corpus callosotomy that included the ventral part of the anterior CC. Having established that interhemispheric PL–pDMS projections could be severed in this manner, we used this disconnection procedure to demonstrate, within subjects, that instrumental training-induced pERK expression in the pDMS relies on input from the PL. We next investigated the functional role of the PL–pDMS pathway in the acquisition of goal-directed action by inducing, along with CC lesions, ipsilateral or contralateral lesions of the PL and pDMS. We found that rats with complete disconnection of the PL from the pDMS, induced by contralateral lesions of the PL and pDMS along with a CC lesion, failed to show a reliable outcome-devaluation effect, indicating that they had failed to acquire goal-directed action. In contrast, rats with ipsilateral PL input to the pDMS or exclusively contralateral input were able to acquire and express goal-directed learning. Finally, we used a pharmacological disconnection procedure to temporarily block the PL input to the pDMS by blocking NMDA transmission only during instrumental training, and again produced a deficit in outcome devaluation, confirming that the PL input to the pDMS is required during goal-directed learning and depends on NMDA-related activity in the pDMS.
PL input to the pDMS is required for instrumental training-induced pERK expression.
It has now been shown several times (Shiflett et al., 2010; Shan et al., 2014; Hart and Balleine, 2016) that instrumental training induces heightened pERK/pMAPK in the pDMS relative to yoked controls. Furthermore, pERK/pMAPK expression in the pDMS is necessary for goal-directed learning (Shiflett et al., 2010). Here, we demonstrated that the instrumental training-induced increase in pERK/pMAPK relies on the PL input to the pDMS. It has been argued that ERK/MAPK phosphorylation in the striatum is increased during coincident dopamine and glutamate signaling (Girault et al., 2007; Shiflett and Balleine, 2011). In this regard, we hypothesize that glutamate signaling generated by the PL input to the pDMS in combination with dopamine signaling during response–outcome pairings produced increased pERK/pMAPK and the consequent plasticity-related changes associated with goal-directed learning. Importantly, the PL input to the pDMS had no impact on pERK/pMAPK expression in rats given yoked training.
We have recently reported (Hart and Balleine, 2016) that instrumental training produces a transient increase in pERK expression in neurons that project to the pDMS from Layers 5–6 of the posterior PL 5 min after training and from Layer 2 of the anterior PL 60 min after training. We hypothesized (Hart and Balleine, 2016) that these neuronal populations may be differentially involved in learning versus consolidation of goal-directed actions; specifically, we suggested that within-session detection of the response–outcome contingency drives striatal plasticity via activation of the primary corticostriatal pathway from PL Layer 5, which then feeds back to neurons in PL Layer 2 via thalamocortical feedback circuits (Cruikshank et al., 2012; Little and Carter, 2012). We also suggested that across the minutes to hours after learning, a second phase of activation occurs in anterior PL Layer 2 neurons, reflecting the consolidation of the recently acquired goal-directed actions, which is updated in the pDMS during the next instrumental training session. This hypothesis implies, therefore, that the PL–pDMS pathway is particularly important for the initial acquisition of goal-directed actions, and that this learning is subsequently transferred to the pDMS via a recurrent corticostriatal–thalamic feedback circuit (Hart and Balleine, 2016). Note, however, that the present studies do not differentiate between anterior and posterior regions of the PL; the lesions generally extended throughout both regions.
Bilateral PL input to pDMS mediates goal-directed learning
The conclusion that corticostriatal disconnection, using asymmetrical lesions of PL and pDMS, abolished goal-directed learning was based on the failure of rats to subsequently show sensitivity to outcome devaluation in an extinction test. Importantly, these same rats were able to show sensitivity to outcome devaluation in a reinforced test; i.e., when the outcomes were delivered and the rats did not have to recall the action–outcome contingencies from training to modify their performance. As such, bilateral PL–pDMS disconnection did not alter response or outcome discrimination or sensitivity to specific satiety-induced devaluation per se. Rather, the deficit observed when these rats were tested in extinction is indicative of a more specific effect on their ability to encode the specific action–outcome associations to which they were exposed during training.
In the final experiment, we used a temporary pharmacological disconnection to directly assess this possibility. Previous reports have indicated that the PL is critical for goal-directed learning, but not for the expression of goal-directed actions at test (Ostlund and Balleine, 2005; Tran-Tu-Yen et al., 2009). We predicted, therefore, that PL-related plasticity in the pDMS is specifically required during training. To test this, rats were pretrained to make two instrumental responses for a common outcome and, during the critical instrumental training phase, each of these two responses was paired with a new outcome after an infusion of dl-APV to block NMDA receptors in the pDMS. Rats were then given an outcome-devaluation test drug-free, when the PL input to the pDMS was intact. Rats that had received APV in the pDMS contralateral to the PL lesion failed to show an outcome-devaluation effect in this test, suggesting they had failed to encode the new action–outcome associations during training. An alternative explanation is that inactivation of the PL input to the pDMS disrupted some aspect of outcome–identity learning, rather than action–outcome learning, such that the rats failed to correctly update previously learned contingencies during instrumental training (cf. Bradfield et al., 2013). Although this remains a possibility, the finding in Experiment 3 that rats with PL–pDMS disconnection were impaired in learning the initial response–outcome contingencies suggests that the deficit reflects a failure to encode the new contingencies introduced in Experiment 4 rather than interference from prior learning.
Finally, although the current studies demonstrate that the connection between the PL and pDMS is necessary to acquire response–outcome associations, they do not establish that this is via a direct monosynaptic projection; we cannot rule out the possibility that this procedure also disrupted indirect projections via an intermediate structure, such as the basolateral amygdala (but see Coutureau et al., 2009). Nevertheless, this approach demonstrated that rats with exclusively ipsilateral or exclusively contralateral inputs from the PL to the pDMS were capable of acquiring goal-directed actions. Projections from the PL to the pDMS can be distinguished based on whether they travel contralaterally: pyramidal tract (PT) neurons project to the brainstem and generate only ipsilateral collaterals in the striatum, whereas intratelencephalic (IT) neurons project both ipsilaterally and contralaterally to the striatum via the CC (Lévesque and Parent, 1998; Shepherd, 2013). Our results strongly suggest, therefore, that goal-directed learning is mediated via IT neuronal input from the PL to the pDMS. However, pathway-specific viral manipulations will be necessary to rule out the involvement of an intermediate structure. Interestingly, ipsilateral (putative PT) projections from the PL to the dorsal striatum have been shown to be necessary for tasks that require integration of attentional, associative, and working-memory information with action selection (Christakou et al., 2001; Baker and Ragozzino, 2014). It is unclear what component of these complex tasks required the recruitment of this corticostriatal pathway. However, one possibility is that the ipsilateral projections are important to use higher-order cues as discriminative stimuli to modulate response selection (Baker and Ragozzino, 2014; Sharpe and Killcross, 2014) or perceptual attentional set shifting (Birrell and Brown, 2000). Alternatively, or perhaps additionally, such projections may be critical for response vigor. 
A consistent observation, particularly in Experiment 3, was that performance was generally suppressed in both the IPSI+CC and CONTRA+CC groups, which suggested to us that the suppression (1) was likely due to the addition of PT neuron ablation to the effect on the IT projection and, more importantly, (2) appeared to be independent of the effects of disconnection on goal-directed learning.
Distinct corticostriatal circuits mediate acquisition and expression of goal-directed actions
As noted, several lesion and inactivation studies have demonstrated that the PL is not required for the expression of goal-directed learning in tasks similar to the one used here (Ostlund and Balleine, 2005; Tran-Tu-Yen et al., 2009). In contrast, ipsilateral basolateral amygdala (BLA)-to-pDMS projections are required for both the acquisition and the expression of goal-directed action (Corbit et al., 2013), and it has recently been shown that the medial portion of the orbitofrontal cortex (mOFC) is necessary for retrieving absent outcome representations to guide goal-directed action (Bradfield et al., 2015). It is an attractive hypothesis that multiple glutamatergic inputs onto pDMS projection neurons are required to encode the action–outcome association, with the BLA likely contributing information regarding outcome value, embedded within the sensory-specific properties of the outcome (Balleine and Killcross, 2006). We have argued (Hart et al., 2014) that this information is retained within the pDMS, and goal-directed action is subsequently expressed according to the current incentive value of the available outcomes, encoded by the BLA and retrieved by the gustatory cortex (Parkes and Balleine, 2013). Notably, the BLA, gustatory cortex, and mOFC all have strong projections to the nucleus accumbens core, which is also critical for outcome devaluation (Corbit et al., 2001), suggesting a simple framework within which to understand the current results in the context of the broader corticostriatal circuitry.
Footnotes
This work was supported by the Australian Research Council (Grant DP150104878), the National Health and Medical Research Council of Australia (Grant GNT1089252), and by a Senior Principal Research Fellowship from the National Health and Medical Research Council of Australia (to B.W.B., GNT1079561).
The authors declare no competing financial interests.
Correspondence should be addressed to Bernard Balleine, Decision Neuroscience Laboratory, Level 4, Matthews Building, School of Psychology, University of New South Wales, Kensington, NSW 2052, Australia. bernard.balleine@unsw.edu.au