Abstract
The orbitofrontal cortex (OFC) is known to play a crucial role in learning the consequences of specific events. However, the contribution of OFC thalamic inputs to these processes is largely unknown. Using a tract-tracing approach, we first demonstrated that the submedius nucleus (Sub) shares extensive reciprocal connections with the OFC. We then compared the effects of excitotoxic lesions of the Sub or the OFC on the ability of rats to use outcome identity to direct responding. We found that neither OFC nor Sub lesions interfered with the basic differential outcomes effect. However, more specific tests revealed that OFC rats, but not Sub rats, relied disproportionately on the outcome, rather than on the discriminative stimulus, to guide behavior, which is consistent with the view that the OFC integrates information about predictive cues. In subsequent experiments using a Pavlovian contingency degradation procedure, we found that both OFC and Sub lesions produced a severe deficit in the ability to update Pavlovian associations. Altogether, the submedius therefore emerges as a functionally relevant thalamic component in a circuit dedicated to the integration of predictive cues to guide behavior, previously conceived as essentially dependent on orbitofrontal functions.
SIGNIFICANCE STATEMENT In the present study, we identify a largely unknown thalamic region, the submedius nucleus, as a new functionally relevant component in a circuit supporting the flexible use of predictive cues. Such abilities were previously conceived as largely dependent on the orbitofrontal cortex. Interestingly, this echoes recent findings from research using an instrumental setup, which showed the involvement of another thalamic nucleus, the parafascicular nucleus, when correct responding requires an element of flexibility (Bradfield et al., 2013a). Therefore, the present contribution supports the emerging view that limbic thalamic nuclei may contribute critically to adaptive responding when an element of flexibility is required after the establishment of initial learning.
Introduction
Survival of living organisms depends on the ability to flexibly select and engage in actions appropriate for an organism's needs or desires. Learning adaptive behaviors involves a combination of instrumental and Pavlovian processes, the latter being especially important when behaviorally relevant outcomes may be associated with particular stimuli. Prefrontal contributions to these processes can be dissociated, with the medial prefrontal cortex being more involved in the acquisition of instrumental associations (Killcross and Coutureau, 2003; Coutureau et al., 2009; Tran-Tu-Yen et al., 2009) and the orbitofrontal cortex (OFC) in Pavlovian associations (Ostlund and Balleine, 2007a; Balleine et al., 2011). Successful action selection, however, requires the integration of information from multiple brain structures, including the striatum (Yin et al., 2005; Corbit and Janak, 2010), the basolateral amygdala (Balleine et al., 2003; Coutureau et al., 2009; Parkes and Balleine, 2013), and, as more recently evidenced, several limbic thalamic nuclei (Corbit et al., 2003; Mitchell et al., 2007; Ostlund and Balleine, 2008; Bradfield et al., 2013a; Alcaraz et al., 2014; Parnaudeau et al., 2015).
Limbic thalamic nuclei typically exhibit specific connectivity with cortical areas, especially with the prefrontal region. Regarding the thalamic projections connecting to the OFC, the mediodorsal thalamus appears as a major source (Groenewegen, 1988; Bradfield et al., 2013b). Although there is a suggestion that this thalamic region may support the use of Pavlovian contingencies (Ostlund and Balleine, 2008), its functional involvement is more evident during instrumental learning (Corbit et al., 2003; Mitchell et al., 2007; Pickens, 2008; Parnaudeau et al., 2015). Moreover, the mediodorsal thalamus also provides a dense innervation to the medial prefrontal cortex, which is well known to support goal-directed behavior in instrumental setups (Balleine et al., 2003; Corbit et al., 2003; Coutureau et al., 2009; Tran-Tu-Yen et al., 2009). Thus, we reasoned that a different, more selective, thalamic source may support Pavlovian learning when functions of the OFC are engaged.
Therefore, to better understand thalamic connections of the OFC, the first experiment of the present study consisted of a tract-tracing approach. We demonstrated that the submedius nucleus (Sub), a poorly known thalamic area, is a major source of thalamic afferents to the OFC. Interestingly, while there is experimental evidence suggesting that the Sub and the OFC may support the modulation of ascending nociceptive information (Tang et al., 2009), no attempt has been made to investigate a possible involvement of the submedius in more cognitive aspects of OFC functions. The present study primarily aimed to address this issue. To do so, we directly compared the functional consequences of specific OFC versus Sub lesions in two tasks requiring the use of stimulus–outcome (S–O) associations to guide and adapt behavior. The first task, conditional discrimination with differential outcomes, explicitly assessed whether lesioned rats were able to use sensory information about the rewards to potentiate performance. Previously, OFC lesions have indeed been shown to impair the so-called differential outcome effect (McDannald et al., 2005). However, the within-subject approach used in that study revealed only subtle effects and, more importantly, did not identify the type of associations that were affected by the lesion in this task. Therefore, we used a more direct, between-subject approach, aimed at dissociating the specific associations used by rats to guide behavior (Blundell et al., 2001). As a second task, we used a contingency degradation procedure to assess the ability to update Pavlovian contingencies, which was previously demonstrated to be impaired by OFC lesions (Ostlund and Balleine, 2007a). Altogether, these approaches made it possible to identify the Sub as a new relevant component of a distributed circuit supporting adaptive performance when Pavlovian contingencies are altered.
Materials and Methods
Animals and housing conditions
Eighty-four male Long–Evans rats obtained from the Centre d'Elevage Janvier (France) were used (weight, 275–325 g at surgery). Rats were initially housed in pairs and accustomed to the laboratory facility for 2 weeks before the beginning of the experiments. Environmental enrichment was provided by tinted polycarbonate tubing elements, in accordance with current French (council directive 2013-118, February 1, 2013) and European (directive 2010-63, September 22, 2010, European Community) laws and policies regarding animal experiments. The facility was maintained at 21 ± 1°C with lights on from 7:00 A.M. to 7:00 P.M. Rats were tested only during the light portion of the cycle. The experimental protocols received approval #5012035-A from the Ethics Committee on December 7, 2012. Six rats were used for tracing experiments. The remaining 78 rats underwent behavioral experiments. For Experiment 1, 62 rats were divided into three lesion groups (Sham/OFC/Sub), each trained with one of two protocols (Consistent/Inconsistent), as follows: Sham–Consistent (n = 10), Sham–Inconsistent (n = 10), OFC–Consistent (n = 10), OFC–Inconsistent (n = 10), Sub–Consistent (n = 11), Sub–Inconsistent (n = 11). Rats from the Consistent groups were also used for Experiment 2. For Experiment 3, 16 naive rats (eight Sham and eight Sub rats) were used to assess the effect of Sub lesions in the Pavlovian degradation task.
Surgery
Rats were anesthetized with 4% isoflurane and placed in a stereotaxic frame with atraumatic ear bars (Kopf) in a flat skull position. Anesthesia was maintained with 1.5–2% isoflurane and complemented by subcutaneous administration of buprenorphine (Buprecare, 0.05 mg/kg).
For retrograde tracing experiments, 4% 3 kDa dextrans (Life Technologies) coupled to either fluorescein or tetramethylrhodamine were dissolved in citric acid, pH 3.0, to enhance retrograde labeling (Kaneko et al., 1996) and injected in the OFC or in the Sub. For the OFC, 0.3 μl injections targeted the ventral area of the OFC (VO) and the lateral area of the OFC (LO) in the same hemisphere as follows: LO: anteroposterior (AP), +3.7 mm from bregma; laterality, ±2.6 mm; dorsoventral (DV), −5.0 mm; VO: AP, +3.7 mm from bregma; laterality, ±1.4 mm; DV, −5.2 mm. These injections were performed in four rats and the use of a specific dextran (i.e., coupled to fluorescein or tetramethylrhodamine) was counterbalanced when targeting either orbitofrontal area. At the level of the submedius, 0.2 μl injections of a different dextran on each side (counterbalanced) were performed on two rats at the following coordinates: AP, −2.7 mm from bregma; laterality, ±0.7 mm; DV, −7.1 mm.
For behavioral studies, neurotoxic lesions were made using multiple NMDA microinjections. Twenty micrograms per microliter NMDA (Sigma-Aldrich) in artificial CSF (CMA Microdialysis) were pressure-injected into the brain through a glass micropipette (outside diameter, ∼100 μm) and polyethylene tubing (Picospritzer, General Valve). For OFC lesions, three lesion sites per side were used as follows: AP, +4.2, +3.7, and +3.2 mm from bregma; laterality, ±0.9, 2.0, and 2.8 mm; DV, −4.4, −4.5, and −5.2 mm from bregma. Each site was injected with 0.1 μl of NMDA. Neurotoxic Sub lesions were made using the same procedure, with one lesion site per side at the following coordinates: AP, −2.7 mm; laterality, ±0.7 mm; DV, −7.1 mm. Each site was injected with 0.05 μl of NMDA. In all cases, the pipette was left in place 3 min after injection before slow retraction. The Sham groups received similar surgery except that the micropipette was lowered only in the cortex and no injection was made (DV, −2.0 mm). Rats were given ≥10 d of recovery before behavioral testing or 7 d before being killed for tracing experiments.
Behavioral apparatus
Animals were trained in eight identical conditioning chambers (40 cm wide × 30 cm deep × 35 cm high; Imetronic), each located inside a sound-attenuating and light-attenuating wooden chamber (74 × 46 × 50 cm). Each chamber had a ventilation fan producing a background noise of 55 dB and four LEDs on the ceiling for illumination. Each chamber had two opaque panels, one on the right side and one on the left side; two clear Perspex walls, one on the back side and one on the front side; and a stainless-steel grid floor (rod diameter, 0.5 cm; inter-rod distance, 1.5 cm). In the middle of the left wall, a magazine (6 × 4.5 × 4.5 cm) received food pellets (45 mg; F0165, Bio-Serv) or a 20% saccharose solution (27478.296, VWR) from dispensers located outside the operant chamber. Speakers in each chamber provided the auditory stimuli, which were a 3 kHz tone or a 10 Hz clicker-train produced by the activation of a mechanical relay. The magazine was equipped with infrared cells to detect the animal's visits. A retractable lever (4 × 1 × 2 cm) could be inserted next to the magazine. A personal computer connected to the operant chambers and equipped with Poly software and interface (Imetronic) controlled the equipment and recorded the data.
Behavioral procedures
Experiment 1: conditional discrimination
The differential outcomes procedure has been described in classic experimental psychology works reporting how animals can take advantage of the sensory properties of the reward when it can serve as a supplemental discriminative cue (Trapold and Overmier, 1972). Typically, animals are required to learn a conditional discrimination task, such as pressing one of two available levers in response to a specific discriminative auditory stimulus. They are required to press the alternate lever when a second auditory stimulus is provided. In the control group, all correct responses (for any stimulus) are rewarded with the delivery of a single type of reward. In the experimental group, a distinct reward (i.e., grain pellet or saccharose solution) is provided when the animal responds correctly for each auditory cue. Faster learning in the experimental group instantiates the differential outcome effect (DOE). The classic interpretation for the DOE is that specific associations are formed between the discriminative stimuli and the sensory properties of the rewards, providing the animals in the experimental groups with an additional way to select the appropriate response (Trapold and Overmier, 1972; Blundell et al., 2001; Urcuioli, 2005). Importantly, it has previously been shown that OFC lesions impair this ability (McDannald et al., 2005).
Magazine training.
Initially, all rats were trained for one 30 min session to collect either the food pellets or the saccharose solution, which were delivered on a random time 60 s schedule.
Instrumental training.
All rats were then trained to press a lever to obtain a reward during four 30-min-long instrumental training sessions. The cage was illuminated and one of the two available levers was inserted for the duration of the whole session. Presses on the lever resulted in the delivery of a reward of either type according to a pseudorandom sequence during this phase. The rats first received training for 2 d (one for each lever) under a continuous reinforcement, fixed ratio 1 schedule (FR1; i.e., each lever press was rewarded) until they had earned 50 rewards or 30 min had elapsed. In the next and final two sessions (one for each lever) of instrumental training, the rewards were delivered according to a random 30 s interval schedule (RI30; i.e., after each reward the lever had no effect for 30 s on average).
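As a rough illustration of the RI30 schedule described above, the following sketch makes the lever effective again after a random delay averaging 30 s and rewards the first press that occurs once that delay has elapsed. The exponential distribution of delays, the function name `simulate_ri`, and the simulated press times are assumptions made for illustration only; the way the control software actually draws its intervals is not specified here.

```python
import random

def simulate_ri(press_times, mean_interval=30.0, seed=0):
    """Return the press times that earn a reward under a random-interval schedule.

    Assumption (hypothetical): the delays are exponentially distributed
    with the given mean.
    """
    rng = random.Random(seed)
    rewarded = []
    # Time at which the first reward becomes available.
    armed_at = rng.expovariate(1.0 / mean_interval)
    for t in sorted(press_times):
        if t >= armed_at:
            rewarded.append(t)
            # After a reward, the lever has no effect until a new delay elapses.
            armed_at = t + rng.expovariate(1.0 / mean_interval)
    return rewarded

# Hypothetical responding: one press every 2 s during a 30 min session.
presses = [i * 2.0 for i in range(900)]
rewards = simulate_ri(presses)
print(len(rewards), "rewards earned for", len(presses), "presses")
```

Under such a schedule, pressing faster earns little additional reward, because availability is governed by elapsed time rather than by the number of presses.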
Conditional discrimination task.
The conditional discrimination task and subsequent tests were adapted from the procedure proposed by Blundell et al. (2001). Briefly, a training session consisted of eight alternating 5 min presentations of the two discriminative stimuli, the tone and the clicker (four tones and four clicks for a total duration of 40 min per training session). Both levers were available during the whole session but only one lever was reinforced during each stimulus. The other lever had no effect during this stimulus but was reinforced during the alternate stimulus. Reinforcements were delivered according to the RI30 schedule previously used for instrumental learning. In particular, at the beginning of each stimulus, the lever had no effect for a random 30 s interval. Rats received a total of 10 training sessions. Each rat was randomly assigned to the Consistent or Inconsistent group. While rats from the Consistent group received a consistent and specific outcome for each stimulus–response (S–R) association (for example, a pellet for tone–left lever and the sucrose solution for click–right lever), rats from the Inconsistent groups were rewarded with an equal probability with pellets and sucrose for each association. All conditions were fully counterbalanced in each lesion group. Presses on the correct and incorrect levers were registered during each stimulus presentation and the proportion of correct responses was then analyzed.
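The dependent measure described above, the proportion of correct responses, can be sketched as follows. The data layout (one pair of press counts per stimulus presentation), the function name, and the counts themselves are hypothetical and serve only to show the calculation.

```python
def proportion_correct(presentations):
    """Proportion of presses made on the reinforced lever across a session.

    `presentations` is a list of (correct_presses, incorrect_presses) pairs,
    one pair per stimulus presentation (hypothetical data layout).
    """
    correct = sum(c for c, _ in presentations)
    total = sum(c + i for c, i in presentations)
    if total == 0:
        return None  # no responding: the proportion is undefined
    return correct / total

# Hypothetical counts for the eight 5 min presentations of one session.
session = [(12, 8), (15, 5), (10, 10), (18, 2),
           (14, 6), (16, 4), (11, 9), (16, 4)]
print(proportion_correct(session))
```

With two levers, a value of 0.5 corresponds to chance responding, so discrimination shows up as a proportion reliably above 0.5.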
Test in extinction (no-reward condition).
After the 10 sessions of conditional discrimination training, a supplemental session was conducted in the exact same conditions except that no reward was delivered. Therefore, the discriminative stimuli were the only cues available to support performance during this test.
Test without stimuli (no-stimulus condition).
Following the test in extinction, a single reacquisition session was conducted for all rats under standard conditions (rewards were again delivered). After this session, the rats were tested again, this time with the rewards but without the stimuli. Therefore, the rewards were the only cues available to support performance during this test.
Outcome reversal.
After a second reacquisition session, a final test was conducted in rats from the Consistent group only. This test consisted of reversing the outcomes (i.e., the rewards) for each S–R association. That is, each stimulus continued to signal the same correct response, but this S–R association was now reinforced with the opposite outcome (e.g., S1–R1 previously reinforced by sucrose was now reinforced with grain). A direct prediction of the differential outcome effect theory is that such a manipulation should considerably impair performance even though the initially learned S–R associations remain valid, showing that performance can be largely governed by the reward itself when it can serve as a predictive cue (Urcuioli, 2005).
Experiment 2: Pavlovian contingency degradation
At the end of Experiment 1, all rats from the Consistent groups underwent additional behavioral testing to investigate whether any of the lesions would interfere with the updating of Pavlovian contingencies. To do so, an initial Pavlovian conditioning phase was conducted, during which rats learned to associate each of the two auditory cues previously experienced with the delivery of one of the outcomes in the magazine. Once the Pavlovian associations were reliably established, the contingency between one of the conditional stimuli (CSs) and its outcome was selectively degraded by providing random rewards with similar probability throughout the session. The animals could thus learn that this particular stimulus was no longer a reliable predictor of the reward as there was an equal probability to receive it during the CS or during the intertrial intervals (ITIs).
Pavlovian conditioning.
Rats received a total of four sessions of Pavlovian training, during which they learned that each predictive auditory cue was associated with the delivery of a particular outcome; that is, the reward that was associated with this stimulus during the last phase of Experiment 1 (Outcome reversal). Each training session consisted of successive presentations of only one of the auditory predictive cues so that the tone was used for two sessions and the clicker was used for two sessions. For each session, there were 20 successive presentations of a given auditory cue with an average ITI of 70 s. Each stimulus presentation lasted 20 s, during which the corresponding outcome was delivered pseudorandomly from zero to four times (average, 1.5 times), for a total of 30 rewards per daily session.
Pavlovian contingency degradation.
Following Pavlovian conditioning, all rats were given eight supplemental daily sessions. These sessions alternated between standard sessions following exactly the same schedule as during initial Pavlovian conditioning and sessions during which the contingency between the auditory cue and the associated outcome was degraded. For each rat, only one S–O association was degraded, the other S–O association being presented in a standard session on alternate days. For the degraded schedule, 30 rewards were distributed throughout the session, both during the stimulus and the ITI, so that the probability of receiving the reward was equal during these two periods. Moreover, care was taken to match as closely as possible the distribution of inter-reward intervals under the nondegraded and degraded conditions. All S–O associations and associated contingency schedules (degraded vs nondegraded) were counterbalanced across rats and lesion groups.
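The logic of the degraded schedule can be made concrete with a short calculation. The sketch below uses the session parameters given for Pavlovian conditioning (20 presentations of 20 s, a 70 s average ITI, 30 rewards) and assumes, for illustration, one ITI per presentation and a split of rewards in proportion to time; the variable and function names are hypothetical.

```python
N_TRIALS = 20        # stimulus presentations per session (from the text)
CS_DURATION = 20.0   # seconds per presentation (from the text)
MEAN_ITI = 70.0      # average intertrial interval in seconds (from the text)
N_REWARDS = 30       # rewards per session (from the text)

cs_time = N_TRIALS * CS_DURATION   # 400 s spent in the stimulus
iti_time = N_TRIALS * MEAN_ITI     # 1400 s, assuming one ITI per presentation

def reward_rates(rewards_in_cs, rewards_in_iti):
    """Rewards per second during the stimulus and during the ITI."""
    return rewards_in_cs / cs_time, rewards_in_iti / iti_time

# Nondegraded schedule: every reward is delivered during the stimulus.
nondegraded = reward_rates(N_REWARDS, 0)

# Degraded schedule (assumption): rewards split in proportion to time,
# so that the per-second probability is the same in both periods.
share_cs = cs_time / (cs_time + iti_time)
degraded = reward_rates(N_REWARDS * share_cs, N_REWARDS * (1 - share_cs))

print("nondegraded (CS, ITI):", nondegraded)
print("degraded    (CS, ITI):", degraded)
```

Under these assumptions, only about 7 of the 30 rewards fall within the stimulus in a degraded session, and the stimulus no longer signals any increase in reward rate over the ITI.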
Test without rewards (in extinction).
After the last session of contingency degradation, rats underwent a final test under extinction conditions. This test consisted of four alternating, 1 min presentations of the stimuli, separated by a fixed 1 min ITI (total session duration, 9 min). No reward was delivered during this test. The critical measure for this test was the rate of visits to the magazine during the first 20 s for each stimulus.
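A minimal sketch of the critical measure, the rate of magazine visits during the first 20 s of each stimulus, is given below. The timestamps, the function name, and the choice to express the rate per minute are assumptions for illustration; the recording software may define a visit and the rate differently.

```python
def visit_rate_per_min(visit_times, stimulus_onsets, window=20.0):
    """Magazine visits per minute within `window` seconds of each stimulus onset.

    All times are in seconds from session start (hypothetical data layout).
    """
    n_visits = sum(
        1
        for onset in stimulus_onsets
        for t in visit_times
        if onset <= t < onset + window
    )
    total_window = window * len(stimulus_onsets)
    return 60.0 * n_visits / total_window

# Hypothetical visit timestamps and two stimulus onsets.
visits = [3.0, 65.0, 70.5, 79.9, 85.0, 190.0]
onsets = [60.0, 180.0]
print(visit_rate_per_min(visits, onsets))
```

Restricting the count to the first 20 s matches the stimulus duration used during training, so the measure is taken over the period in which rewards had previously been delivered.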
Experiment 3: Pavlovian contingency degradation in naive rats
To examine the extent to which prior training in the DOE task may have influenced Pavlovian responding and to confirm the selective effect of Sub lesions on Pavlovian processes, a new cohort of naive rats comprising both Sham and Sub animals was submitted to the Pavlovian degradation task. The only differences were that Pavlovian training was given for six consecutive daily sessions (instead of four in Experiment 2) and degradation was assessed for 10 consecutive sessions (instead of eight in Experiment 2). Extending these periods was necessary to establish asymptotic performance in Sham rats, suggesting that prior DOE training had indeed facilitated Pavlovian conditioning. All other aspects of the task followed exactly the procedures described above for Experiment 2.
Histology
Animals received a lethal dose of sodium pentobarbital and were perfused transcardially with 150 ml of saline followed by 400 ml of 10% formalin. The sections throughout the OFC and the Sub regions were collected onto gelatin-coated slides and dried before being stained with thionine. Histological analysis of the lesions was performed under the microscope by two experimenters (M.W. and F.A.) blind to lesion conditions.
Dextrans revelation
For tracing experiments, rats were perfused transcardially with 150 ml of saline followed by 400 ml of 4% paraformaldehyde (PFA). Brains were kept in the same PFA solution overnight, then 40 μm sections of the prefrontal cortex and the thalamus were cut using a vibratome. Immunochemistry was performed on the sections to enhance the dextran staining. First, sections were rinsed in PBS 0.1 M (5 × 5 min), and then incubated in a blocking solution for 1 h (4% donkey serum and 0.2% Triton X-100 in PBS 0.1 M). Immediately after, sections were put in a bath containing primary antibodies, rabbit anti-tetramethylrhodamine (A-6397, Life Technologies) and goat anti-fluorescein (A-11096, Life Technologies), which were diluted at 1:1000 and 1:200, respectively, in the blocking solution for incubation at 4°C for 48 h. Sections were then rinsed in PBS 0.1 M (4 × 5 min) and placed for 2 h in a bath containing a donkey anti-goat secondary antibody coupled to fluorescein (1:200 in PBS 0.1 M; 705-546-147, Jackson ImmunoResearch). After further rinses in PBS 0.1 M (4 × 5 min), the slices were incubated with the other secondary antibody, goat anti-rabbit coupled to tetramethylrhodamine (1:200 in PBS 0.1 M; 111-025-003, Jackson ImmunoResearch) for 2 h. Following four 5 min rinses in PBS 0.1 M, Hoechst solution (B2883, Sigma-Aldrich) for counterstaining was added for 15 min (1:5000 in PBS 0.1 M). Finally, sections were rinsed in PB 0.1 M (4 × 5 min), mounted in PB 0.05 M onto gelatin-coated slides, and coverslipped with the antifading reagent Fluoromount G (0100-01, SouthernBiotech). Images were then captured using a Nanozoomer slide scanner (Hamamatsu Photonics) and analyzed with the NDP.view freeware (Hamamatsu Photonics).
Data analysis
The data were submitted to ANOVAs on StatView software (SAS Institute) with Lesion (Sham/OFC/Sub) and Procedure (Consistent/Inconsistent) as between-subject factors, Period (ITI, CS) or Condition (Degraded/Nondegraded), and Session or Two-session Blocks as repeated measures when appropriate. For the Pavlovian conditioning experiment, the dependent measure was the average frequency of magazine visits before any pellet delivery during any stimulus presentation. The α value for rejection of the null hypothesis was 0.05 throughout.
Results
Tracing experiments
The cortical injections of the retrograde tracers at the level of the orbitofrontal region produced intense labeling at the level of the thalamus, as shown in Figure 1A. Two main thalamic loci were particularly prominent. These corresponded to the mediodorsal thalamus and the submedius. Injections to both the LO and VO resulted in consistent and intense labeling of ipsilateral thalamic cells in the submedius of the four rats examined, with a topographic organization that followed an anteroposterior gradient. The most rostral part of the submedius appeared to innervate both the VO and the LO, with a relative segregation within this thalamic subregion. While the ventrolateral submedius appeared to project preferentially to the LO, the dorsomedial submedius provided substantial afferents to the VO. Farther back on the anteroposterior axis, the submedius seemed to innervate essentially the LO. Altogether, thalamocortical afferents from the submedius to the OFC appeared to predominantly innervate the LO, ipsilaterally.
Regarding the injections of the same retrograde tracers at the thalamic level, intense labeling was present mostly in the ipsilateral OFC (Fig. 1B). In particular, cell bodies were present in the dorsolateral parts of the orbitofrontal cortex, the LO, and the VO. A few cells were also visible in the insular and somatosensory cortex. In conclusion, we demonstrate strong reciprocal connections between the submedius and the orbitofrontal regions.
Histology
The OFC and Sub lesions are shown in Figure 2. In general, OFC lesions were comparable to those previously reported in the literature using the same coordinates and volumes of injections (Rudebeck et al., 2007) and typically consisted of extensive damage to the whole extent of the ventrolateral orbital region, while the medial prefrontal cortex was essentially unaffected (Fig. 2A). One OFC rat was discarded as it exhibited only minimal damage, which was too anterior. In most cases, the small injections at the level of the submedius produced the expected specific bilateral damage with only moderate damage to the reuniens/rhomboid complex that lies between the two Subs (Fig. 2B). In a few cases (n = 5), substantial damage to this region was also observed, together with additional encroachment on the centromedial, and even the mediodorsal, thalamic nuclei in the worst case (unilaterally). These animals, however, did not differ behaviorally from those without such damage in any of the tasks examined. Three experimental Sub rats were discarded from behavioral analyses because the lesion was too posterior, leaving the submedius essentially intact. In addition, one Sham rat became sick after surgery and had to be killed. The thin glass micropipette caused no detectable mechanical injury to either the cortex or the thalamus. The final groups for Experiments 1 and 2 were therefore as follows: Sham: n = 19; Consistent group, 9; Inconsistent group, 10; OFC: n = 19; Consistent group, 10; Inconsistent group, 9; Sub: n = 19; Consistent group, 9; Inconsistent group, 10. For Experiment 3, all Sub lesions were sufficiently accurate and highly comparable to those included in Experiments 1 and 2 (Sham, n = 8; Sub, n = 8).
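As a consistency check, the final group sizes listed above can be related to the error degrees of freedom of the ANOVAs reported for Experiment 1 (e.g., F(1,51), F(2,25), F(2,26), F(4,204)). The sketch below assumes a standard design with between-subject cells and, for repeated measures, five two-session blocks derived from the 10 training sessions; the function names are hypothetical.

```python
# Final group sizes for Experiments 1 and 2 (from the text).
final_n = {
    ("Sham", "Consistent"): 9, ("Sham", "Inconsistent"): 10,
    ("OFC", "Consistent"): 10, ("OFC", "Inconsistent"): 9,
    ("Sub", "Consistent"): 9, ("Sub", "Inconsistent"): 10,
}

def between_error_df(cells):
    """Error df for between-subject effects: total N minus number of cells."""
    return sum(cells.values()) - len(cells)

def within_error_df(cells, n_levels):
    """Error df for a repeated measure with `n_levels` levels."""
    return (n_levels - 1) * between_error_df(cells)

consistent = {k: v for k, v in final_n.items() if k[1] == "Consistent"}
inconsistent = {k: v for k, v in final_n.items() if k[1] == "Inconsistent"}

print("Lesion x Procedure design:", between_error_df(final_n))
print("Consistent groups only:", between_error_df(consistent))
print("Inconsistent groups only:", between_error_df(inconsistent))
print("Block effect (5 blocks):", within_error_df(final_n, 5))
```

The values obtained (51, 25, 26, and 204) agree with the denominators of the F ratios reported for the acquisition phase.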
Experiment 1: conditional discrimination
Acquisition of the conditional discrimination task is shown in Figure 3A–C for Sham, OFC, and Sub rats, respectively. During acquisition of the task, the main observation was that animals trained with the consistent procedure achieved much higher performance than animals from the Inconsistent groups, regardless of lesion status. Gradual learning of the conditional task was confirmed by a highly significant effect of session Block (F(4,204) = 53.00, p < 0.0001). The better performance in the Consistent groups was confirmed by a significant effect of Procedure (F(1,51) = 59.87, p < 0.0001) and a significant Block × Procedure interaction (F(4,204) = 18.80, p < 0.0001). The main effect of Lesion failed to reach significance (F(2,51) = 2.57, p = 0.0869), as did any interaction with the Lesion factor [Lesion × Procedure (F < 1); Lesion × Block (F(8,204) = 1.34, p = 0.2248); Lesion × Block × Procedure (F(8,204) = 1.38, p = 0.2072)], indicating that neither lesion significantly affected performance during acquisition of the task. This was observed in both the Consistent [Lesion, F(2,25) = 1.35, p = 0.2779; Lesion × Block, F < 1] and the Inconsistent groups [Lesion, F(2,26) = 2.50, p = 0.1009; Lesion × Block, F(8,104) = 1.74, p = 0.0979]. The significant effect of Block for both procedures, however, confirmed that all rats gradually improved over time (Consistent: Block, F(4,100) = 81.46, p < 0.0001; Inconsistent: Block, F(4,104) = 4.24, p = 0.0032), although less convincingly for the Inconsistent groups. Even in the latter case, however, the Lesion × Block interaction failed to reach significance (Consistent, F < 1; Inconsistent, F(8,104) = 1.74, p = 0.0979). Furthermore, the total numbers of presses on the correct and on the incorrect levers were similar across groups (Correct lever: Lesion, F(2,51) = 1.10, p = 0.3389; Incorrect lever: Lesion, F < 1). Acquisition of the conditional discrimination task therefore revealed strong evidence of the differential outcome effect (Fig. 3, gray area) in all rats under circumstances where both the stimuli and the outcomes may support performance. The next step was to selectively assess the ability to use either only the stimuli (test without outcome, in extinction) or only the outcomes (test without stimulus) to guide behavior.
Test without outcomes (in extinction)
The test conducted in extinction followed the exact same procedures as those used during acquisition except that no rewards were delivered (i.e., both levers were inactive during this test). Performance of all groups during this single extinction session is plotted in Figure 4A. Again, the main observation was that rats previously trained with the consistent procedure maintained much better performance than rats from the Inconsistent groups, even when no feedback was provided regarding the choices made by the animals. This major observation was confirmed by a highly significant effect of Procedure (F(1,51) = 88.39, p < 0.0001). Interestingly, the effect of Lesion now reached significance (F(2,51) = 5.0, p = 0.0104) and a Fisher post hoc test confirmed that both the OFC and the Sub groups exhibited slightly reduced performance in this test (p's < 0.05), without differing from each other (p = 0.88). However, the Lesion × Procedure interaction did not reach significance (F(2,51) = 2.05, p = 0.1394), indicating that OFC and Sub rats exhibited slightly worse performance than Sham rats in the absence of rewards, regardless of the protocol used.
Test without stimuli
After a single reacquisition session conducted under standard conditions (stimuli and outcome present), an additional test was conducted, during which there were no stimuli, but pressing an arbitrarily defined correct lever was rewarded with the appropriate outcome. The correct lever alternated every 5 min exactly as if the stimulus was present. The animals could therefore only rely on the outcome to guide their behavior. The performance of the different groups of rats during this test is shown in Figure 4B. While the omission of the stimuli produced a considerable reduction of performance for most groups, it was evident that rats trained with the consistent protocol continued to maintain superior performance (Procedure, F(1,51) = 77.44, p < 0.0001). Again, the main effect of Lesion was significant (F(2,51) = 3.58, p = 0.0352) and the post hoc Fisher test indicated that OFC rats outperformed both the Sham and the Sub groups (p's < 0.05), which did not differ from each other (p = 0.91). Furthermore, the Lesion × Procedure interaction was close to significance (F(2,51) = 3.02, p = 0.0576), suggesting that the superior performance exhibited by OFC rats in this test was more evident when assessed with the consistent procedure. This latter possibility was further supported by a significant effect of Lesion for the consistent procedure only (Consistent: F(2,25) = 4.32, p = 0.0244; Inconsistent: F < 1), confirming superior performance of OFC rats over both Sham and Sub rats in this instance (both p's < 0.05).
Outcome reversal
After a final reacquisition session, a last test was conducted, during which the specific reward delivered when rats responded correctly to the current stimulus was reversed. That is, the learned S–R associations were maintained but the associated outcomes were reversed. Therefore, this test could only be conducted in the Consistent groups. To provide a more comprehensive view of the effect of this manipulation on the level of performance, Figure 4C compares the performance for the Consistent groups in outcome reversal with the immediately preceding reacquisition session. The most striking effect was the dramatic performance drop exhibited by all groups (Session: F(1,25) = 351.03, p < 0.0001). There was a moderate but significant effect of Lesion (F(2,25) = 4.15, p = 0.0278) in this instance, with OFC rats showing slightly reduced performance during these final sessions when compared with Sham rats (p = 0.0105) and, to a lesser extent, with Sub rats (p = 0.0553). This deficit was, however, not specific to the outcome reversal as the Lesion × Session interaction was not significant (F < 1).
Comparison of the performance of the consistent groups across the three tests
To provide a more comprehensive overview of the effect of OFC and Sub lesions on the different tests examined, we performed a specific analysis for the Consistent groups only, where the performance during each test was expressed relative to that of the last session of acquisition. Thus, we could treat tests as repeated measures and focus on the critical Lesion × Test interactions. This analysis is represented in Figure 5, which shows how the performance of Sham and lesioned rats varied as a function of the test examined. While the performance of Sham and Sub rats seemed to follow the same pattern of variations, with better performance during the extinction test, OFC rats showed a markedly different pattern, which was particularly evident during the test without stimuli, during which their performance was essentially unaffected and considerably superior to that of the other two groups. This observation was supported by a highly significant effect of Test (F(2,50) = 24.40, p < 0.0001), and, importantly, by the Test × Lesion interaction (F(4,50) = 3.87, p = 0.0082), while the main effect of Lesion was not significant (F(2,25) = 1.16, p = 0.3301). Specific analyses confirmed the existence of a Lesion effect for the test without stimuli (F(2,25) = 10.72, p = 0.0004), where OFC rats outperformed both the Sham and the Sub groups (p's < 0.01). Similar analyses run on the other two tests revealed no significant effect of Lesion (F's < 1).
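The normalization underlying this analysis can be sketched as follows. This is a minimal Python illustration that assumes "relative to" means a simple ratio of test performance to performance in the last acquisition session. The scores, data layout, and function names are hypothetical placeholders invented for the example, not data or code from the study.

```python
from statistics import mean

def relative_performance(test_scores, baseline):
    """Express each test score as a fraction of the last-acquisition score."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return {test: score / baseline for test, score in test_scores.items()}

def group_means(rats):
    """Average the relative scores within each (group, test) cell."""
    cells = {}
    for rat in rats:
        relative = relative_performance(rat["tests"], rat["baseline"])
        for test, value in relative.items():
            cells.setdefault((rat["group"], test), []).append(value)
    return {cell: mean(values) for cell, values in cells.items()}

# Hypothetical percent-correct scores, invented for illustration only.
rats = [
    {"group": "Sham", "baseline": 80.0,
     "tests": {"extinction": 72.0, "no_stimuli": 40.0, "reversal": 20.0}},
    {"group": "Sham", "baseline": 90.0,
     "tests": {"extinction": 81.0, "no_stimuli": 45.0, "reversal": 22.5}},
    {"group": "OFC", "baseline": 80.0,
     "tests": {"extinction": 60.0, "no_stimuli": 76.0, "reversal": 20.0}},
    {"group": "OFC", "baseline": 100.0,
     "tests": {"extinction": 75.0, "no_stimuli": 95.0, "reversal": 25.0}},
]

means = group_means(rats)
for (group, test), value in sorted(means.items()):
    print(f"{group:5s} {test:11s} {value:.2f}")
```

Expressing every test on the same baseline-relative scale is what allows Test to be treated as a repeated measure, so that each rat contributes one value per test to the Lesion × Test design.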
The main findings derived from Experiment 1 were therefore that all rats exhibited a strong differential outcomes effect throughout all phases of testing, even if performance was impaired in both lesioned groups during the test in extinction that assessed the ability to use stimuli as predictive cues. Interestingly, while the performance of Sham and Sub rats appeared to similarly rely on the use of the stimuli, OFC rats appeared remarkably efficient, even when the stimuli were not available, suggesting that they favored the use of the outcome over the use of stimuli to guide their behavior.
Experiment 2: Pavlovian contingency degradation
Pavlovian training
During the initial Pavlovian training, all rats appeared to learn the Pavlovian contingencies similarly (data not shown). Learning was evidenced by a higher rate of visits to the magazine during the stimulus prior to reward delivery as opposed to before the stimulus, as indicated by the highly significant effect of Period (F(1,23) = 86.67, p < 0.0001) as well as the significant Session × Period interaction (F(1,23) = 11.10, p = 0.0029). The lack of effect of Lesion (F < 1) or any interaction with this factor (Session × Lesion, Period × Lesion, F's < 1; Session × Period × Lesion, F(2,23) = 1.80, p = 0.1882) confirmed similar Pavlovian conditioning in all rats. Moreover, the baseline, corresponding to the last session of training, was identical among the three groups (Lesion, F < 1).
Contingency degradation
Figure 6A shows the rate of visits to the magazine during the CS, relative to the corresponding baseline during Pavlovian training, for the cues with either degraded or nondegraded predictive value. This figure clearly shows that Sham rats visited the magazine at a much higher rate when the Pavlovian contingency was not degraded. The critical finding was, however, that neither the OFC nor the Sub rats expressed such differential behavior, as they maintained similar visit rates to the magazine for both CSs, including the one that was no longer a reliable predictor. The main effect of Lesion was not significant (F < 1), indicating that all rats were similar in terms of the overall number of magazine visits during the CS. However, the existence of differential behavior in Sham and lesioned rats was confirmed by a significant effect of Degradation (F(1,23) = 14.55, p = 0.0009) and, critically, by the significant Lesion × Degradation interaction (F(2,23) = 4.54, p = 0.0219). The remaining interactions did not reach significance. Separate analyses run on each group confirmed that Sham (Degradation: F(1,6) = 11.93, p = 0.0136) but not OFC or Sub rats (F(1,9) = 1.76, p = 0.2178; F(1,8) = 1.09, p = 0.3279, respectively) correctly adapted their response to the new Pavlovian contingencies.
Test without rewards (in extinction)
Figure 6B displays the rate of visits to the magazine during the first 20 seconds of the CS for the cues with either degraded or nondegraded predictive value. The most striking observation was that differential behavior in Sham but not OFC nor Sub rats was maintained even in the absence of the rewards. Thus, all statistical analyses were highly consistent with those run when the rewards were available. The main effect of Lesion was not significant (F < 1). The main effect of Degradation also failed to reach significance (F(1,23) = 1.40, p = 0.2482), but, of prime importance, there was a significant Degradation × Lesion interaction (F(2,23) = 3.60, p = 0.0436). Separate analyses confirmed that only Sham rats appeared to modify their behavior according to the new Pavlovian contingencies (Degradation: F(1,6) = 9.80, p = 0.0203), while both the OFC and the Sub rats failed to do so (Degradation: F's < 1).
Altogether, these results therefore consistently point to an inability in both OFC and Sub animals to adapt to altered Pavlovian contingencies between a predictive cue and an outcome. However, extensive prior training in the DOE task may have influenced associative processes at play during the different phases of the Pavlovian task. To address this issue, we conducted a third experiment in naive Sham and Sub rats.
Experiment 3: Pavlovian contingency degradation in naive rats
Pavlovian training
During the initial Pavlovian training, both Sham and Sub rats appeared to learn the Pavlovian contingencies similarly (data not shown). Learning was evidenced by a progressively higher rate of visits to the magazine across training during the stimulus (before reward delivery) as opposed to before the stimulus, as indicated by the highly significant effect of Period (F(1,14) = 104.47, p < 0.0001) and Block (F(2,28) = 4.82, p = 0.0159), as well as the significant Block × Period interaction (F(2,28) = 5.48, p = 0.0098). All rats were found to learn the task at the same rate as neither the main effect of Lesion (F(1,14) = 1.33, p = 0.2677) nor the Period × Lesion (F(1,14) = 2.04, p = 0.1748) or Block × Lesion (F < 1) interaction reached significance. Moreover, the baseline, corresponding to the last session of training, was identical between the two groups (Lesion, F < 1). Thus, Sub lesions had no effect on the acquisition of Pavlovian conditioning in naive animals.
Contingency degradation
Figure 7A shows the rate of visits to the magazine during the CS, relative to the corresponding baseline during Pavlovian training, for the cues with either degraded or nondegraded predictive value. Only Sham rats exhibited adaptive responding, with progressively reduced responding only for the stimulus corresponding to the degraded Pavlovian association. Again, Sub rats did not express such differential behavior as they maintained similar visit rates to the magazine for both CSs, including the one that was no longer a reliable predictor. The main effect of Lesion was not significant (F < 1), indicating that all rats were similar in terms of the overall number of magazine visits during the CS. However, the existence of a differential behavior in Sham and Sub rats was confirmed by a significant Lesion × Degradation interaction (F(1,14) = 6.88, p = 0.0201). Separate analyses run on the two groups confirmed that Sham (Degradation: F(1,7) = 5.97, p = 0.0446) but not Sub rats (F(1,7) = 1.43, p = 0.2703) correctly adapted their response to the new Pavlovian contingencies.
Test without rewards (in extinction)
Figure 7B displays the rate of visits to the magazine during the first 20 seconds of the CS for the cues with either degraded or nondegraded predictive value. The most striking observation was that differential behavior in Sham but not Sub rats was maintained even in the absence of the rewards. The main effect of Lesion was not significant (F(1,14) = 1.94, p = 0.1857), while both the main effect of Degradation (F(1,14) = 3.56, p = 0.0800) and the Degradation × Lesion interaction (F(1,14) = 3.89, p = 0.0685) approached significance. Separate analyses confirmed that only Sham rats appeared to modify their behavior according to the new Pavlovian contingencies (Degradation: F(1,7) = 7.22, p = 0.0313), while Sub rats failed to do so (Degradation: F < 1).
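The correspondence between an F statistic, its degrees of freedom, and the reported p value can be checked numerically. The following minimal Python sketch computes the upper-tail probability of the F distribution from the regularized incomplete beta function, evaluated with a standard continued-fraction expansion. It checks the Degradation × Lesion interaction using the 14 error degrees of freedom that apply to the other between-group tests in this experiment. All function names are illustrative assumptions and none of this code comes from the original analysis.

```python
import math

def _lentz_step(term, c, d, tiny=1e-300):
    """One update of the modified Lentz recurrence."""
    d = 1.0 + term * d
    if abs(d) < tiny:
        d = tiny
    c = 1.0 + term / c
    if abs(c) < tiny:
        c = tiny
    return c, 1.0 / d

def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction used to evaluate the incomplete beta function."""
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    if abs(d) < 1e-300:
        d = 1e-300
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even-numbered coefficient of the expansion.
        term = m * (b - m) * x / ((a - 1.0 + m2) * (a + m2))
        c, d = _lentz_step(term, c, d)
        h *= d * c
        # Odd-numbered coefficient of the expansion.
        term = -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))
        c, d = _lentz_step(term, c, d)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for a, b > 0 and 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry relation where the direct expansion converges slowly.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b

def p_value_f(f_value, df_effect, df_error):
    """Upper-tail p value of F(df_effect, df_error)."""
    if f_value <= 0.0:
        return 1.0
    x = df_error / (df_error + df_effect * f_value)
    return regularized_incomplete_beta(df_error / 2.0, df_effect / 2.0, x)

# Degradation x Lesion interaction in the extinction test of Experiment 3.
p_interaction = p_value_f(3.89, 1, 14)
print(f"F(1,14) = 3.89 -> p = {p_interaction:.4f}")
```

Since the published F values are rounded to two decimals, agreement with the reported p values can only be expected to about the third decimal place.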
Altogether, these data obtained in a separate cohort of rats without prior training in the DOE task confirmed the detrimental effect of Sub lesions on the ability to adapt to altered Pavlovian contingencies between a predictive cue and an outcome.
Discussion
Altogether, the present data unambiguously identify the Sub as a prominent thalamic innervation to the OFC, functionally relevant for the flexible use of Pavlovian contingencies. Moreover, we provide a detailed analysis of the effect of OFC lesions in the DOE task. While lesions of the OFC did not impede the establishment of the DOE, all data indicated that OFC rats relied less on the discriminative stimuli and more on the outcome, consistent with the view that S–O associations are fundamentally supported by this region. By contrast, the additional recruitment of the submedius appears to be necessary only when these associations require updating.
Most if not all cortical areas can be defined by their specific thalamic afferents. Concerning the prefrontal region, the OFC was initially defined as a cortical region whose prominent thalamic projections arose from the mediodorsal thalamus (Rose and Woolsey, 1948). There is still much interest in understanding the specific functional connectivity between these two regions in both rodents and humans (Klein et al., 2010; Jakab et al., 2012; Ewing et al., 2013), but this has somewhat obscured the existence of the thalamic innervation originating from the submedius. Our own observations confirm earlier reports that the submedius provides a dense and reciprocal innervation to the ventrolateral orbitofrontal cortex (Coffield et al., 1992; Yoshida et al., 1992). Still, this innervation was left essentially unaddressed in terms of cognitive functions outside the domain of nociception (Tang et al., 2009). The present study therefore provides original insight into the functional relevance of these connections in the light of our current understanding of orbitofrontal functions.
Both the DOE task and Pavlovian contingency degradation are thought to capture the gist of orbitofrontal functions. These tasks were previously shown to be impaired by OFC lesions (McDannald et al., 2005; Ostlund and Balleine, 2007a). During acquisition of the conditional discrimination task, all experimental groups showed the DOE. In addition, there was no indication of any deficit during this phase. This lack of effect of OFC lesions appears to contradict a previous study reporting that OFC lesions impair DOE learning (McDannald et al., 2005). However, in the within-subject approach used by these authors, there was no evidence of a general deficit in performance, which is consistent with the present data, and only a careful analysis of the nature of errors in the task distinguished OFC rats from control rats. While the DOE task may not be a critical test of OFC functions, it provides a powerful way to explicitly test each of the multiple associations that may guide behavior during DOE training.
Trapold was probably the first to report that conditional discriminations based on predictive cues are learned more rapidly when distinct rewards are consistently provided for each association (Trapold, 1970; Trapold and Overmier, 1972). Of prime interest is the distinction between S–R strategies that disregard outcome identity and strategies relying on specific S–O associations that may constitute the core of OFC functions (Delamater, 2007; Ostlund and Balleine, 2007b; Balleine et al., 2011). Clearly, the magnitude of the DOE, as well as the prominent drop of performance during the reversal test in all consistent groups, is a strong indication that the rats exploited the specific nature of the outcome to perform the task. The classic interpretation posits that S–O associations between the stimulus and the sensory properties of the reward are learned by the subject and provide an outcome expectancy [E(O)] that supplements the explicit predictive cue, thereby potentiating performance (Urcuioli, 2005). Superior performance in the Consistent groups may therefore result from the integration of S–O and outcome–response (O–R) associations via E(O) to select the correct response. However, when the outcome is physically present, as during training or during the test without stimuli, specific O–R associations alone may be sufficient to support performance. This is why the subsequent tests are particularly revealing as to the nature of the deficits presented by OFC animals.
These tests suggest that Sham and OFC rats were primarily relying on different associations to support efficient performance. OFC rats coped better than Sham rats with the removal of the stimuli, while in the test conducted without the rewards, they showed a clear impairment. Thus, OFC rats may not primarily be using S–O associations but rather the reward itself and O–R associations to perform the task. By contrast, Sham and Sub rats appear to rely heavily on S–O associations. This view is consistent with a large body of data indicating that the role of the OFC becomes prominent when the use of S–O associations is critical for successful performance (Delamater, 2007; Ostlund and Balleine, 2007b; Balleine et al., 2011). More specifically, the OFC appears particularly critical when the information delivered by predictive cues is integrated to produce expectancies or imagine future outcomes (Schoenbaum et al., 2011; Takahashi et al., 2013). In contrast, Sub rats exhibited a behavioral profile similar to that of Sham rats throughout the different phases of the task, although there was a modest impairment when the rewards were removed, suggesting that, unlike the OFC, this region may not be critical for the acquisition and use of S–O associations. However, while the second experiment confirmed a prominent deficit in OFC rats in the ability to adapt to changed Pavlovian contingencies, which is consistent with an earlier report (Ostlund and Balleine, 2007a), the most remarkable result was the severe deficit exhibited by Sub rats on this occasion, also confirmed in naive rats in Experiment 3. The behavior exhibited by Sham rats in Experiments 2 and 3 strongly suggests that prior training in the DOE task facilitated the subsequent establishment of Pavlovian associations, as it was necessary to prolong both the initial training and the degradation phase to see the development of optimal performance in Experiment 3, which was conducted without prior DOE training. Nonetheless, Sub lesions produced a strikingly similar detrimental effect on both occasions, totally sparing initial training while preventing adaptive responding during the degradation phase.
Thus, both the OFC and the Sub may be required for flexible outcome-guided behavior, a feature previously conceived as largely dependent on the OFC alone (McDannald et al., 2014). This result is of great importance as it introduces a functionally relevant thalamic stage in a circuit encompassing cortical (i.e., the OFC) and temporal (i.e., the BLA) structures concerned with the use of outcomes and values (Balleine and Killcross, 2006; Balleine et al., 2011; Parkes and Balleine, 2013).
This important result echoes recent findings showing an involvement of the parafascicular thalamic nucleus when animals were required to express flexibility after the establishment of initial learning (Bradfield et al., 2013a). Thus, the functional significance of the innervation provided by the Sub to the orbitofrontal region may be more evident when relevant information needs updating. While the present results are by themselves sufficient to identify the Sub as a critical locus for adaptive responding in the context of a Pavlovian degradation task, a disconnection procedure might address more specifically the functional interactions between the Sub and the OFC. Assessing the generality of the involvement of the Sub in adaptive processes appears to be a promising avenue. This could be achieved by classical assessment of behavioral flexibility, such as response reversal learning or, alternatively, by assessment of “overexpectation” (Rescorla, 2007; Takahashi et al., 2009). This latter option appears particularly appealing as, in this situation, the subject expects a quantitatively better outcome than the actual reward, which contrasts with the degradation task, where decreased responding reflects lower expectancy of the reward as a result of the degradation procedure. Interestingly, recent views emphasize an involvement of the thalamus in the circuits that are necessary to signal discrepancies between predicted and actual outcomes (Ullsperger et al., 2014; Chase et al., 2015). Based on the present data, it is not possible to determine whether the thalamic submedius is involved in monitoring outcomes or whether it more directly supports the adaptive response itself. Clarifying the relative functional contributions of the cortical and thalamic areas in these processes certainly represents a major objective for future studies.
In conclusion, we were able to partially dissociate the contribution of the OFC and the thalamic submedius in the use of learned Pavlovian associations. Understanding the functional connectivity between specific thalamic areas and their related temporocortical circuits represents a major issue for the coming years (Wolff et al., 2015), with considerable relevance for many pathological conditions, such as schizophrenia (Anticevic et al., 2014) and addiction (Balleine et al., 2014).
Footnotes
This work was supported by a grant from the French agency for research, Agence Nationale pour la Recherche Thalame. The microscopy was done in the Bordeaux Imaging Centre, a service unit of the Centre National de la Recherche Scientifique–Institut National de la Santé et de la Recherche Médicale, and Bordeaux University, member of the France BioImaging national infrastructure, with help from Christel Poujol and Sébastien Marais. We also thank Yoan Salafranque for animal care.
The authors declare no competing financial interests.
Correspondence should be addressed to Mathieu Wolff, Institut de Neurosciences Cognitives et Intégratives d'Aquitaines (INCIA), UMR 5287, CNRS/Université de Bordeaux, Université de Bordeaux-Site Carreire, BP31 146 rue Léo Saignat, 33076 Bordeaux cedex, France. mathieu.wolff@u-bordeaux.fr