Animals rely on environmental cues to identify potential rewards and select the best reward available. The orbitofrontal cortex (OFC) is proposed to encode sensory-specific representations of expected outcome. However, its contribution to the selection of a preferred outcome among different reward options is still unclear. We investigated the effect of transient OFC inactivation (achieved by presession injection of muscimol and baclofen) in a novel two-reward choice task. In discrete trials, rats could choose between a solution of polycose and an equally caloric, but highly preferred, solution of sucrose by visiting one of two liquid dispensers after the presentation of a specific cue signaling the availability of one or both of the solutions. We found that OFC inactivation did not affect outcome preference: rats maintained high preference for sucrose and adapted their behavioral responding when the cue–outcome contingencies were reversed. However, when rats were tested drug-free 24 h after OFC inactivation and reversal learning, memory for the newly learned contingencies was poor. These results suggest a potential conflict between OFC (encoding pre-reversal contingencies) and other brain circuits (encoding the new contingencies). Remarkably, repeating the OFC inactivation before the reversal memory test restored normal behavior, confirming the hypothesis of a dominant impact of OFC on other decision-making circuits. These results indicate that the representations encoded in the OFC, while not essential to the expression of outcome preference, exert hierarchical control on downstream decision-making circuits.
Neural activity in the orbitofrontal cortex (OFC) has been shown to correlate with the anticipation of a pending outcome (Murray et al., 2007). This prediction signal is thought to serve multiple behavioral functions. First, by signaling the potential outcome of a choice, the OFC can participate in the decision-making process (Ostlund and Balleine, 2007; Young and Shapiro, 2011). Second, these outcome-related signals may be essential for the interpretation of the actual choice outcome, such that the greater the discrepancy between the predicted and the actual outcome, the more that is learned about the environment and/or the choice made (Schoenbaum et al., 2009).
Importantly, outcome prediction signals are not restricted to the OFC and are found in several other brain regions, including the amygdala and striatum (Paton et al., 2006; Roesch et al., 2010). It has been argued that these different brain areas encode different properties of an expected outcome. Consistent with a model-based reinforcement system, the OFC signals the specific identity of an outcome by encoding specific sensory properties, such as flavor and texture (Rolls and Grabenhorst, 2008). In contrast, the ventral striatum has been argued to encode the value of different outcomes in a common currency, consistent with a role in a model-free reinforcement system (Maia, 2009; Roesch et al., 2009), although a contribution to model-based learning has also been suggested (Daw et al., 2011; McDannald et al., 2012). In agreement with the above ideas, OFC lesions impair rats' ability to select an action based on outcome identity, or to learn about changes in outcome identity (Ostlund and Balleine, 2007; McDannald et al., 2011).
In an effort to identify the neural bases of identity or value encoding, previous studies have manipulated these two properties independently (McDannald et al., 2011). However, value and identity are often not independent. The value of an outcome is often determined precisely by its sensory properties. For example, for equal calories, one outcome might be preferred to another because of its distinctive flavor. The OFC has been shown to signal the specific identity of different possible outcomes, but it is not clear whether it is involved in the expression of identity-based preferences.
The objective of this work was to clarify the role of OFC in behaviors driven by the preferred identity of an outcome, in a novel two-reward choice procedure. Our findings indicate that rats maintain cued responding for a preferred reward when the OFC is inactivated, and can adapt their behavioral responding when the contingencies that signal the preferred reward are reversed. However, after reversal learning, when the OFC is again functional, memory for the reversal is poor. Taking the OFC off-line once more before a reversal memory test restores normal behavior, indicating that poor reversal memory 24 h after OFC inactivation resulted from a conflict between multiple memory systems. Thus, while other structures can compensate for OFC inactivation, the OFC plays a prominent role in the processing of environmental cues to select a favored outcome, and exerts hierarchical control on this choice behavior.
Materials and Methods
One hundred and twelve adult male Long–Evans rats (250–300 g on arrival; Harlan) were used. Rats were individually housed in ventilated Plexiglas cages in a temperature-regulated (21 ± 1°C) and light-regulated (12 h light/dark cycle, lights on at 7:00 A.M.) colony room. After a week of acclimation to the colony room, feeding was restricted to maintain weights at ∼90% of ad libitum feeding weights. All procedures were approved by the Institutional Animal Care and Use Committee of the Ernest Gallo Clinic and Research Center.
Behavioral testing was conducted in eight identical conditioning chambers (30.5 × 32.5 × 29.5 cm; Med Associates) individually enclosed in ventilated sound-attenuating wooden cubicles. Each conditioning chamber was fitted with a recessed nose-poke operandum on the back wall, and two fluid receptacles located in two separate recessed ports on the front wall. The two fluid receptacles were separated by 16.5 cm and were equidistant from the nose-poke operandum. A cue light (28 V, 100 mA) was located above each fluid receptacle and oriented to illuminate the inside of the port. Three cue lights (green, yellow, and red LEDs) were located inside the nose-poke operandum. Each fluid receptacle was connected via polyethylene tubing to a syringe containing either sucrose or polycose solution. The delivery of these solutions was controlled by two syringe pumps (one pump for each port/reward). Nose-poke responses and port entries were detected by photo-beam interruptions. A computer equipped with Med-PC software (Med Associates) controlled all experimental programs and data collection.
Over 6 d, rats were trained to associate the illumination of a port (either on the left or right side of the front panel) with the availability of a specific reward in that port (either 15% sucrose or 15% polycose). Each session lasted 2 h and consisted of 48 presentations of either the polycose cue (days 1, 3, 5) or the sucrose cue (days 2, 4, 6). Each cue was presented on average every 2.5 min (±1 min) and lasted 30 s. During the presentation of a cue, the associated reward was available on a variable interval 10 s (±3 s) schedule. Rats had to visit the appropriate port or maintain presence in that port to trigger the delivery of 0.1 ml of sucrose or polycose solution in that port (delivered in 3 s).
Training (cued choice task).
Rats were trained to initiate each trial by making a nose poke on the back panel. Onset of three LED lights inside the nose-poke operandum signaled that a trial could be initiated. A nose-poke response would then switch the LED lights off and trigger the presentation 0.5 s later of either the sucrose cue (illumination of the sucrose port; 50% of the trials) or the polycose cue (illumination of the polycose port; 50% of the trials) on the front panel. This nose-poke requirement for trial initiation ensured that rats were engaged in the task at the time of cue presentation and started each trial in a position equidistant from both ports/cues. During a trial, the cue was presented for a maximum of 15 s, during which rats had to visit the appropriate port to obtain a reward (delivered in that port 0.5 s after port entry). A reward consisted of 0.1 ml of either 15% sucrose or 15% polycose, depending on the cue presented (delivered over 3 s). A trial ended 2 s after the end of the reward delivery (which corresponds to the estimated time required to consume the reward), or immediately after a visit in the incorrect port, or after 15 s without a response (omission). At the end of a trial, the reward cue was switched off and the nose poke was illuminated again, signaling the opportunity to initiate a new trial. Trial order was randomly determined with no more than four consecutive repetitions of the same trial type (i.e., sucrose or polycose). Sessions ended after 2 h or after the rats completed 120 trials.
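The trial-order constraint described above (random order, no more than four consecutive repetitions of the same trial type) can be illustrated with a minimal sketch. This is not the Med-PC program used in the study: the function names, the exact 60/60 split of a 120-trial session, and the use of rejection sampling (reshuffle until the constraint is met) are all assumptions made for illustration.

```python
import random
from itertools import groupby


def longest_run(trials):
    """Length of the longest streak of identical consecutive trial types."""
    return max((len(list(group)) for _, group in groupby(trials)), default=0)


def make_trial_order(counts, max_run=4, seed=None, max_attempts=100_000):
    """Shuffle trials until no trial type repeats more than max_run times in a row.

    counts maps a trial type to its number of trials (hypothetical layout).
    Rejection sampling keeps every valid order equally likely.
    """
    rng = random.Random(seed)
    trials = [kind for kind, n in counts.items() for _ in range(n)]
    for _ in range(max_attempts):
        rng.shuffle(trials)
        if longest_run(trials) <= max_run:
            return list(trials)
    raise ValueError("no valid order found; check counts and max_run")


# Assumed split: 120 trials, half sucrose cue and half polycose cue.
training = make_trial_order({"sucrose": 60, "polycose": 60}, max_run=4, seed=0)
```

The same function would accept three trial types (for example, sucrose cue, polycose cue, and both cues together), which is one way the mixed sessions with cued and free choice trials could be sequenced under the same repetition limit.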
Test (free choice task).
Rats previously trained to associate the two different cues with the two different outcomes were given the opportunity to choose between these two outcomes (Fig. 1). At the concentration used (15% sucrose and 15% polycose), the two solutions are equally caloric, but sucrose is highly preferred over polycose (Sclafani et al., 1987). As in training sessions, rats were required to initiate each trial by making a nose poke. Following a nose poke, the cues were presented either individually (as in training) or simultaneously (except for Experiment 1b, in which the cues were only presented simultaneously). When a cue was presented individually, rats had 15 s to visit the appropriate port to obtain 0.1 ml of the corresponding reward (delivered 0.5 s after port entry). These trials are referred to as “cued choice” and represented two-thirds of the trials (one-third sucrose cue; one-third polycose cue). When both cues were presented simultaneously, rats had 15 s to visit one of the two ports to obtain 0.1 ml of the chosen reward (except for Experiment 1b, in which the responses were not reinforced). These trials are referred to as “free choice” and represented one-third of the trials. A trial ended 2 s after the delivery of a solution, after a failure to respond in 15 s, or after a visit in the incorrect port in the cued choice trials. The order of the trials was randomly determined with a maximum of four consecutive repetitions of the same trial type. The session ended after 2 h or after the rats completed 120 trials. Each rat experienced three test sessions before the beginning of any experimental manipulation (one presurgery and two postsurgery) to establish baseline outcome preference.
These baseline test sessions were interspersed with three training sessions (which consisted of cued choice trials exclusively) to prevent the potential development of a strategy based purely on response preference (rats always choosing the sucrose port, regardless of the cue presented) and instead encourage rats to consider the cue presented. Rats failing to express a stable sucrose preference during the baseline test sessions were excluded from the analysis. Performance on the last baseline test was used to constitute groups of equal sucrose preference.
After 1 week of training and one test session, rats (weighing ∼320–370 g) were anesthetized with isoflurane and standard stereotaxic procedures were used to implant 26 gauge stainless steel guide cannulae (Plastics One) targeted at the OFC (anterior–posterior: +3.5 mm; medial-lateral: ±2.6 mm; dorsal-ventral: −2.5 mm; coordinates relative to bregma). Guide cannulae tips were positioned 2 mm dorsal to the intended infusion site, making the final dorsal-ventral coordinate for injector tips −4.5 mm relative to bregma. Rats were given at least 1 week to recover from the surgery and were retrained for 1 week before testing.
During the week of retraining following surgery, rats were habituated to the infusion procedure by daily sham infusions. Injectors slightly shorter than the guide cannulae were inserted, but no infusion was administered. On test day, inactivation of the OFC was achieved by bilateral infusion of a drug solution containing the GABAA agonist muscimol (125 ng) and the GABAB agonist baclofen (125 ng; M/B). These doses were chosen because comparable doses of baclofen and muscimol in the OFC have been shown to have behavioral effects (Takahashi et al., 2009; St Onge and Floresco, 2010; Zeeb et al., 2010; Stopper et al., 2012). The drug solution was injected via 33 gauge injectors extending 2 mm below the guide cannula tip, in a volume of 0.5 μl delivered over 90 s. Saline was administered as the control treatment. Injectors were left in place for 2 min postinfusion to allow for diffusion. Rats were then placed back in their home cage for 7–9 min before being placed into the conditioning chambers (except for Experiment 2b, in which injections were given postsession and rats remained in their home cage until the next test session 24 h later).
Experiment 1a: Role of OFC in reinforced choice.
To determine the role of the OFC in the ability to use environmental cues to select the preferred option, 17 rats were trained to associate the two different cues with the two different outcomes. Performance during a preinjection baseline test was used to constitute two groups of equal sucrose preference. To determine the role of the OFC in the choice process, rats then received intra-OFC injections of either M/B (n = 8) or saline (n = 9) before a free choice test session.
Experiment 1b: Role of OFC in nonreinforced choice.
When the responses are reinforced, the preference for one port over another results not only from previously learned cue–outcome associations but also from the reinforcing effects of sucrose and polycose experienced during that test session. To determine the role of OFC in choices based purely on the remembered cue–outcome association, 21 naive rats were trained to associate the two different cues with the two different outcomes and later tested for their preference for either cue, but no reinforcer was delivered. In this situation of extinction, the insertion of cued choice trials in a random order could create a highly variable extinction history that could affect rats' responses in free choice trials. For this reason, this test session consisted entirely of free choice trials (simultaneous presentation of both cues). To determine the role of OFC in this choice process, rats were separated into two groups of equal sucrose preference (measured during a baseline test) and received intra-OFC injections of either M/B (n = 10) or saline (n = 11).
Experiment 2a: Role of OFC in reversal learning and memory (presession inactivation).
The rats used in Experiment 1a were retrained for a week in the cued choice task before being used for this experiment. The group assignment was shuffled between the two experiments. The assessment of reversal learning and recall was conducted over three consecutive sessions. The first session consisted of a drug-free baseline preference test with the initial contingencies (contingencies learned in training). On the second day, the rats were tested in a free choice test with reversed contingencies: the cue/port previously associated with sucrose now delivered polycose and vice versa. Rats had no prior experience with reversal learning and this was the first time they experienced reversed contingencies. To determine the role of OFC in reversal learning, the rats received intra-OFC infusions of either M/B (n = 8) or saline (n = 9) before the test session with reversed contingencies (reversal day 1). The memory of the new cue–outcome contingencies was tested drug-free the following day in a similar test session (reversal day 2).
Experiment 2b: Role of OFC in reversal memory (postsession inactivation).
To determine when OFC activity is required for reversal memory, 20 naive rats were trained to associate the two different cues with the two different outcomes. As in Experiment 2a, the assessment of reversal learning and recall was conducted over three consecutive sessions. A first session measured the baseline, pre-reversal response preference. The performance on this day was used to constitute two groups of equal sucrose preference. On the second day, the contingencies were reversed (reversal day 1) and either M/B (n = 10) or saline (n = 10) was administered intra-OFC immediately after the session. The effect of this postsession infusion on reversal memory was determined the following day (reversal day 2).
Experiment 3: Effect of repeated OFC inactivation on reversal memory.
Forty-two naive rats were trained to associate the two cues with the two outcomes. As in the previous experiments, reversal learning and recall were assessed over three consecutive test sessions: baseline, reversal day 1, and reversal day 2. The first test measured the baseline (drug-free) response preference before reversal. The performance on this day was used to constitute four groups of equal sucrose preference. The following day, the contingencies were reversed (reversal day 1). A final test, the following day, measured the recall of these newly learned contingencies (reversal day 2). Saline or M/B was injected intra-OFC on both reversal days, generating four experimental groups. A first group (Sal–Sal; n = 9) was injected with saline before the first session of reversal (reversal day 1) and the following day (reversal day 2). A second group (M/B–Sal; n = 12) was injected with M/B before the first session of reversal and with saline the following day. A third group (Sal–M/B; n = 11) was injected with saline before the first session of reversal and with M/B the following day. A last group (M/B–M/B; n = 10) was injected with M/B both before the first session of reversal and the following day. The effect of these different combinations of infusions was determined on reversal learning (reversal day 1) and reversal recall (reversal day 2).
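The 2 × 2 structure of this experiment (treatment on reversal day 1 crossed with treatment on reversal day 2) can be tabulated as a short sketch. The group sizes come from the text above; the variable names are hypothetical, and the error degrees of freedom are computed under the assumption that all 42 rats enter a fully between-subject analysis with four cells.

```python
from itertools import product

# Group sizes keyed by (treatment on reversal day 1, treatment on reversal day 2).
group_sizes = {
    ("Sal", "Sal"): 9,
    ("M/B", "Sal"): 12,
    ("Sal", "M/B"): 11,
    ("M/B", "M/B"): 10,
}

# Every combination of the two treatments appears exactly once.
assert set(group_sizes) == set(product(("Sal", "M/B"), repeat=2))

n_total = sum(group_sizes.values())      # 42 rats in total
error_df = n_total - len(group_sizes)    # 38, assuming four between-subject cells

# Marginal group sizes for each factor of the day 1 x day 2 design.
day1_n = {drug: sum(n for (d1, _), n in group_sizes.items() if d1 == drug)
          for drug in ("Sal", "M/B")}
day2_n = {drug: sum(n for (_, d2), n in group_sizes.items() if d2 == drug)
          for drug in ("Sal", "M/B")}
```

Under these assumptions the design is slightly unbalanced on the day 1 factor (20 saline vs 22 M/B) and balanced on the day 2 factor (21 vs 21).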
Placement of cannulae and injector tips was verified in 50 μm cresyl violet-stained coronal brain slices with reference to a standard brain atlas (Paxinos and Watson, 1998). Figure 2 illustrates the placement of the injector tips for all rats included in this study. Only rats with injector tips located in the OFC were included in statistical analysis.
Rats' preference for one port over another was measured in free choice trials and expressed as a preference score [preference score = (left port choices − right port choices)/(left port choices + right port choices)]. A preference score of zero indicates that both ports were equally chosen, whereas a preference score of 1 or −1 indicates an absolute preference for one of the two ports. For convenience, the outcome associated with each port is indicated on the figures. The preference score is reported for the whole session or for bins of 10 trials to examine the evolution of preference within a session. Single-sample t tests were used to determine response preference, by testing the hypothesis that the preference score was different from zero. The effect of OFC inactivation on preference scores was analyzed by one-way ANOVAs or two-way repeated-measures ANOVAs (drug; trials). The cued choice trials measured the accuracy of the responses to each cue presented individually. This accuracy was expressed as a percentage of correct responses. The effect of OFC inactivation on accuracy measures was examined by two-way ANOVAs (drug; cue presented). A statistical rejection criterion of 0.05 was used for all analyses. The Student–Newman–Keuls test was used for all post hoc comparisons. A total of 12 rats were excluded from the analysis for the following reasons: incorrect cannulae placement (n = 6), brain infection (n = 2), death during surgery (n = 1), failure to acquire reward-seeking behavior (n = 1), and unstable reward preference during baseline tests (n = 2; preference was considered stable if rats chose sucrose on >70% of the free choice trials during the last two baseline tests, before any OFC injection).
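The preference score and its 10-trial binning can be written out as a minimal sketch. The formula follows the definition given above; the function names, the "left"/"right" labels, the handling of omitted trials, and the example data are hypothetical and serve only to illustrate the arithmetic.

```python
def preference_score(choices):
    """(left - right) / (left + right) over free choice trials.

    choices is a list of "left"/"right" labels; any other entry
    (for example None for an omission) is ignored. Returns None
    when no port was chosen at all.
    """
    left = choices.count("left")
    right = choices.count("right")
    if left + right == 0:
        return None
    return (left - right) / (left + right)


def binned_preference(choices, bin_size=10):
    """Preference score for successive bins of bin_size free choice trials."""
    return [preference_score(choices[i:i + bin_size])
            for i in range(0, len(choices), bin_size)]


def percent_correct(n_correct, n_trials):
    """Accuracy on cued choice trials, as a percentage of correct responses."""
    return 100.0 * n_correct / n_trials


# Hypothetical session: preference shifts from the right port to the left port.
session = ["left"] * 3 + ["right"] * 7 + ["left"] * 8 + ["right"] * 2
whole_session = preference_score(session)   # (11 - 9) / 20 = 0.1
by_bin = binned_preference(session)         # [-0.4, 0.6]
```

In this made-up example the whole-session score is close to zero even though each bin shows a clear preference, which is the reason a within-session (binned) analysis is informative when preference changes over the course of a session.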
The OFC is not required for choosing a preferred outcome
We trained rats in a two-reward task in which the onset of a cue light inside a reward port signaled that sucrose or polycose would be available following insertion of the snout into the correct (i.e., cued) reward port. On test day, these “cued choice” trials (two-thirds of trials) were interspersed with “free choice” trials (one-third of trials) in which the cue lights inside both reward ports were presented simultaneously, and rats received sucrose or polycose, depending on which reward port they chose to visit (Fig. 1). The outcomes were chosen based on the known preference rats have for sucrose over polycose (Nissenbaum and Sclafani, 1987a, b; Sclafani et al., 1987).
We first tested the role of the OFC in the ability to use environmental cues to select a preferred outcome (Fig. 3A). Two groups of rats presenting similar sucrose preference in baseline condition (F(1,15) = 0.023, p = 0.881; Fig. 3B) received presession intra-OFC infusion of either saline or M/B, the GABAergic agonists muscimol and baclofen. Pharmacological inactivation of the OFC by intra-OFC infusion of M/B did not affect overall measures of performance; all rats completed the maximum number of trials in the session, and the time required to complete all trials was not altered (saline: 30.2 ± 5.9 min; M/B: 32.6 ± 5.2 min; mean ± SEM; F(1,15) = 0.09, p = 0.77).
In free choice trials, the simultaneous presentation of the sucrose and polycose cue signaled that both outcomes were available. When given the opportunity to choose between outcomes by visiting one of the two ports, rats expressed a clear preference for the sucrose solution over the polycose solution (saline: t(8) = 10.63, p < 0.001; M/B: t(7) = 14.75, p < 0.001). OFC inactivation did not affect this preference (F(1,15) = 0.28, p = 0.61; Fig. 3B). By assessing correct responding on cued choice trials, we gauged the accuracy of responding to each cue presented individually. While rats responded with greater accuracy to the sucrose cue than the polycose cue (F(1,30) = 10.21, p = 0.003), accuracy to both cues was high (>85%), indicating that rats were able to alternate their responses based on the signaled availability of the different outcomes. OFC inactivation did not affect response accuracy (treatment: F(1,30) = 0.52, p = 0.48; treatment by cue presented: F(1,30) = 0.76, p = 0.39; Fig. 3C).
The preference for one port over the other could result not only from previously learned cue–outcome associations, but also from the reinforcing effect of sucrose and polycose experienced during the test session. Consequently, the effects of OFC inactivation on reward choice might be masked by the immediate reinforcing effects of sucrose and polycose solution. To determine the role of OFC in choices based purely on the remembered cue–outcome associations, we trained two separate groups of rats in the same task and tested their preference for responding at the sucrose or polycose port without delivering the associated solution (Fig. 3A). The two groups expressed similar sucrose preference in baseline (drug-free) condition (F(1,19) = 1.27; p = 0.274; Fig. 3D). In the nonreinforced test session, intra-OFC M/B infusion significantly reduced overall responding. The analysis of the number of trials initiated over time revealed a significant effect of M/B inactivation (F(1,19) = 26.46, p < 0.001), a significant effect of time (F(19,361) = 166.71, p < 0.001), and an interaction between these factors (F(19,361) = 44.80, p < 0.001; Fig. 3E). Post hoc comparisons showed that early in the session, both groups of rats initiated a similar number of trials, but rats receiving M/B progressively stopped initiating trials, causing the groups to diverge in this measure after 24 min (24 min: p = 0.037; 30 min: p = 0.007; 36 min: p = 0.002; 42 min and later time points: p < 0.001; Fig. 3E). Critically, despite the reduction in the number of trials initiated, OFC inactivation did not affect the relative distribution of responses in those trials that were initiated. All rats expressed a preference for the port previously associated with sucrose (all p's < 0.026), and that preference was not affected by M/B infusion (F(1,19) = 0.39, p = 0.54; Fig. 3D).
To account for the possibility that responding later in the session may be more strongly affected by extinction learning, we analyzed responding early in the session (first 10 trials). This analysis confirmed that M/B did not affect preference for the port previously associated with sucrose (F(1,19) = 0.68, p = 0.42; Fig. 3D). Collectively, these findings indicate that neural activity within the OFC is not required for expression of cue-directed reward seeking under conditions of simultaneous choice, when choosing between two rewards that are unequally preferred.
Functional OFC is not required for reversal learning in a two-reward choice task, but is required for later expression of that learning
Extensive prior findings support a role for the OFC in reversal learning (Chudasama and Robbins, 2003; McAlonan and Brown, 2003; Schoenbaum et al., 2003; Kim and Ragozzino, 2005; Boulougouris et al., 2007; Ghods-Sharifi et al., 2008; Burke et al., 2009). Therefore, we tested the hypothesis that the OFC is required for reversal learning in the two-reward choice task by switching the outcome associated with each port. In two groups expressing comparable sucrose preference in baseline condition (F(1,15) = 0.29, p = 0.6), we examined performance after OFC inactivation on reversal learning (reversal day 1) and on later expression of that learning assessed 24 h later in a drug-free state (reversal day 2; Fig. 4A). OFC inactivation did not affect overall responding on reversal day 1. All rats completed the maximum number of trials in the session and there was no significant effect on the time required to complete the trials (saline: 28.5 ± 4.5 min; M/B: 28.0 ± 5.2 min; mean ± SEM; F(1,15) = 0.004, p = 0.95). M/B infusion did not affect preference as indicated by performance in free choice trials when averaged across the entire session (F(1,15) = 0.06, p = 0.81; Fig. 4B); saline-treated and M/B-treated rats chose sucrose and polycose in approximately equal proportions, resulting in preference scores that did not significantly differ from zero (saline: t(8) = −0.098, p = 0.92; M/B: t(7) = −0.76, p = 0.47; Fig. 4B). This seemingly balanced preference did not result from a strategy of random response selection. Indeed, analysis of responding in cued choice trials revealed that rats responded to each individual cue with high accuracy. A two-way ANOVA conducted on accuracy revealed a significant effect of the cue presented (F(1,30) = 5.85, p = 0.02), with higher accuracy in response to the sucrose cue, but no effect of M/B as indicated by the absence of a significant treatment effect (F(1,30) = 0.24, p = 0.63), and no treatment × cue interaction (F(1,30) = 0.06, p = 0.81; Fig. 4C).
The seemingly balanced preference observed in free choice trials is better explained by the distribution of the choices over the course of the session. A two-way ANOVA indicated that rats' preference changed over the session (F(3,45) = 21.708, p < 0.001). Initially rats expressed a preference for the polycose port, which was previously associated with sucrose. Over the course of the session, rats progressively reversed that preference and adapted their behavior to the new location of sucrose (Newman–Keuls: preference score for trials 11–20 > preference score for trials 21–30, p = 0.001; preference score for trials 21–30 > preference score for trials 31–40, p = 0.029). This progressive reversal was not affected by OFC inactivation, as attested by the absence of a treatment × trial interaction (F(3,45) = 1.35, p = 0.27; Fig. 4D). These results indicate that intra-OFC infusion of M/B did not impair initial reversal learning in this procedure.
To determine whether this treatment affected later expression of reversal memory, all rats were retested drug-free the following day (reversal day 2; Fig. 4A). While M/B infusion was without noticeable effect on reversal day 1, performance on reversal day 2 was significantly altered, with a reduction in the overall preference for the sucrose port (F(1,15) = 5.96, p = 0.027; Fig. 4B). This reduced preference for the sucrose port cannot be explained by reduced response accuracy to the sucrose cue. Indeed, responding to the individual reward cues measured in cued choice trials remained highly accurate. A two-way ANOVA revealed a higher accuracy in response to the sucrose cue (F(1,30) = 10.74, p = 0.003), but no treatment effect (F(1,30) = 0.03, p = 0.86) or treatment × cue interaction (F(1,30) = 0.08, p = 0.78; Fig. 4C).
The analysis of the distribution of the free choices over the session revealed a significant treatment effect (F(1,15) = 6.007, p = 0.027), trial effect (F(3,45) = 24.082, p < 0.001), and interaction (F(3,45) = 3.139, p = 0.034). Post hoc comparisons showed that the preference score of the two groups differed early in the session (trials 1–10: p = 0.002; trials 11–20: p = 0.032). Rats previously injected with saline on reversal day 1 showed good recall of the new contingencies and demonstrated a preference for the new sucrose port early and throughout the length of the session (preference score <0; all p's < 0.047). In contrast, rats previously infused with M/B on reversal day 1 chose predominantly the polycose port early in the session (the port associated with sucrose pre-reversal; t(7) = 2.77, p = 0.028), with a progressive shift in this preference to the new location of sucrose, indicating poor recall of the new contingencies (Fig. 4D). Targeted ANOVAs revealed that, unlike saline-treated rats, which showed a different preference score on the first and second day of reversal (F(1,24) = 14.684, p = 0.005), rats with OFC inactivation expressed a remarkably similar pattern of choice over both reversal sessions (no main effect of day: F(1,21) = 0.47, p = 0.52; or day × trial interaction: F(3,21) = 0.54, p = 0.66), indicating that there was virtually no recall of the new contingencies on the second reversal session.
The fact that OFC inactivation before reversal learning impairs later expression of this learning suggests that normal activity in the OFC is required to consolidate the newly acquired cue–outcome associations. Alternatively, OFC inactivation could impair reversal memory by generating a conflict between multiple memory systems. The following experiments tested these two hypotheses.
Postsession OFC inactivation does not impair expression of reversal learning
Infusion of M/B can reduce neuronal activity for several hours. Therefore, it is likely that activity in the OFC was suppressed both during and after the first reversal session in Experiment 2a, both of which could potentially contribute to the impaired recall observed in that experiment. To clarify when the OFC is required for good performance the following day, this experiment examined the effect of post-reversal infusions on later reversal recall (reversal day 2). Infusion of M/B immediately after reversal learning on reversal day 1 had no effect on reversal recall (F(1,18) = 0.53, p = 0.48). Statistical comparison of the presession (from the prior experiment, above) and postsession effects of M/B revealed a significant interaction between drug and time of infusion (F(1,33) = 5.14, p = 0.03). Unlike pre-reversal infusions, which significantly impaired later reversal recall (Newman–Keuls: p = 0.02), post-reversal infusions had no effect on reversal recall (Newman–Keuls: p = 0.47; Fig. 5).
Repeated OFC inactivation restores normal reversal memory
Our results thus far show that when rats acquire a reversal after an intra-OFC M/B infusion, they show impaired reversal recall when later tested drug-free. We hypothesized that this impairment was the result of a conflict between the OFC still encoding pre-reversal contingencies and other systems encoding the new contingencies. If this is true, then OFC inactivation before reversal days 1 and 2 should prevent such a conflict and allow the OFC-independent processes that acquired the new contingencies on day 1 to guide behavior on day 2 in a manner consistent with the new contingencies, i.e., repeated M/B infusion should restore normal performance. In this experiment, intra-OFC infusions of saline and/or M/B were administered on both reversal days, generating four experimental groups (Sal–Sal; M/B–Sal; Sal–M/B; and M/B–M/B; Fig. 6A). These four groups did not differ in their baseline (drug-free) preference for the sucrose reward (F(3,38) = 0.221, p = 0.88). As expected, an ANOVA revealed no significant effect of the drug treatment on preference scores the first day of reversal (F(3,38) = 0.207, p = 0.89; Fig. 6B,C).
Preference scores on the second day of reversal were analyzed by a two-way ANOVA with drug treatment on day 1 and drug treatment on day 2 as separate factors. This two-way ANOVA revealed no main effect of M/B injected on day 1 (F(1,38) = 2.72, p = 0.11) or on day 2 (F(1,38) = 0.41, p = 0.52), but a significant interaction between the injection of M/B on days 1 and 2 (F(1,38) = 12.43, p = 0.001). Post hoc Newman–Keuls comparisons showed that while intra-OFC M/B before the recall session significantly impaired recall in rats previously injected with saline (Sal–Sal vs Sal–M/B: p = 0.007), it significantly improved recall in rats previously injected with M/B (M/B–Sal vs M/B–M/B: p = 0.043; Fig. 6B). Analysis of the evolution of preference during the second day of reversal, depicted in Figure 6C, revealed a significant treatment × trial interaction (F(9,114) = 4.372, p < 0.001). Post hoc Newman–Keuls comparisons confirmed that OFC inactivation on either the first day (M/B–Sal) or the second day (Sal–M/B) of reversal impaired reversal memory early in the session (Sal–M/B vs Sal–Sal: trials 1–10, p = 0.035; M/B–Sal vs Sal–Sal: trials 1–10, p < 0.001; trials 11–20, p < 0.001). However, M/B injection on both reversal days restored normal performance throughout the session (M/B–M/B vs Sal–Sal: all p's > 0.123). To further analyze the effect of the different treatments on the ability to recall the newly learned contingencies, we ran a two-way repeated-measures ANOVA comparing the performance on the last 10 trials of reversal day 1 with the first 10 trials of reversal day 2. This ANOVA revealed a significant effect of treatment (F(3,38) = 2.86, p = 0.05) and day (F(1,38) = 23.10, p < 0.001), and a significant interaction between these factors (F(3,38) = 2.97, p = 0.044). Post hoc Newman–Keuls comparisons showed that inactivation on either reversal day 1 or reversal day 2 impaired reversal recall (trials 31–40 of day 1 vs trials 1–10 of day 2; p < 0.001 for both M/B–Sal and Sal–M/B). However, control rats and rats given OFC inactivation on both days showed stable performance, indicative of good reversal recall (trials 31–40 of day 1 vs trials 1–10 of day 2; p > 0.11 for both Sal–Sal and M/B–M/B).
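The crossover pattern tested by the day 1 × day 2 interaction can be illustrated with a simple difference-of-differences on group means. The sketch below is only an illustration of that logic, not the analysis performed in this study: the preference scores are hypothetical placeholders chosen to mimic the qualitative pattern (they are not the data shown in Fig. 6), and the names `cell_means` and `interaction_contrast` are assumptions introduced here.

```python
from statistics import mean

# Hypothetical preference scores (fraction of sucrose choices) per group,
# keyed by (day 1 treatment, day 2 treatment).
# Illustrative placeholders only, NOT data from the study.
scores = {
    ("Sal", "Sal"): [0.80, 0.85, 0.90],
    ("Sal", "M/B"): [0.55, 0.60, 0.65],
    ("M/B", "Sal"): [0.45, 0.50, 0.55],
    ("M/B", "M/B"): [0.75, 0.80, 0.85],
}

def cell_means(data):
    """Mean preference score for each of the four treatment groups."""
    return {group: mean(values) for group, values in data.items()}

def interaction_contrast(means):
    """Difference of differences: the effect of day 2 M/B in rats given
    M/B on day 1, minus the same effect in rats given saline on day 1."""
    effect_after_sal = means[("Sal", "M/B")] - means[("Sal", "Sal")]
    effect_after_mb = means[("M/B", "M/B")] - means[("M/B", "Sal")]
    return effect_after_mb - effect_after_sal

means = cell_means(scores)
contrast = interaction_contrast(means)
print(f"interaction contrast: {contrast:.2f}")
```

With these placeholder values, day 2 M/B lowers the score after day 1 saline but raises it after day 1 M/B, so the two simple effects have opposite signs and largely cancel in the main effects while producing a large interaction contrast. A full analysis would additionally require the within-group variance to form the F ratio.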
We show here that while OFC inactivation during reversal learning impairs later reversal memory, normal reversal memory can be restored by a second OFC inactivation before memory testing. These results indicate that the poor reversal memory observed 24 h after OFC inactivation is due to a conflict between OFC and other memory systems.
To probe the role of the OFC in choice behavior under initial learned contingencies and after reversal of those contingencies, we used a novel two-reward choice procedure in which rats were given a choice between sucrose and polycose. We found that OFC inactivation did not alter the rats' preference for sucrose over polycose and did not impair the acquisition of new cue–outcome contingencies in reversal. However, OFC inactivation before, but not immediately after, reversal learning impaired recall of the new contingencies when assessed 24 h later. Importantly, normal reversal memory could be restored by a second OFC inactivation immediately before the reversal recall test. Collectively, these findings indicate that other brain regions can compensate for the absence of a functional OFC, but the OFC normally exerts hierarchical control on choice behavior. These findings extend understanding of OFC contributions to behavior guided by outcome expectancy.
OFC inactivation does not affect sucrose preference over polycose
Sucrose and polycose activate different taste receptors on the rodent tongue, generating distinct taste experiences (Nissenbaum and Sclafani, 1987a; Treesukosol et al., 2011). In the present study, the sucrose and polycose solutions had similar caloric content, yet sucrose was highly preferred. This preference is also observed when a gastric fistula prevents the absorption of the solutions (Nissenbaum and Sclafani, 1987b), indicating that preference for sucrose over polycose depends essentially on their distinctive flavors and not on postingestive consequences. It is generally accepted that preference is determined by the expected value of the different options, the option with the highest expected value being the one selected (Padoa-Schioppa, 2011). While the expected value is believed to be computed in the striatum (Montague et al., 1996; Joel et al., 2002; Roesch et al., 2009), the OFC encodes specific sensory aspects of an outcome, such as its taste, texture, etc., from which value can potentially be derived (Rolls and Grabenhorst, 2008; Gottfried and Zelano, 2011; McDannald et al., 2011). Although this encoding suggests a critical role for the OFC in outcome preference, we found that OFC inactivation does not affect taste preference; rats continued to express preference for sucrose over polycose. It could be argued that this result reflects the expression of previously acquired cue–response associations, presumably stronger in the case of the sucrose cue, rather than intact taste preference. However, OFC inactivation did not impair the acquisition of reversal learning, indicating that rats remained sensitive to outcome value and adapted their responses to obtain their preferred outcome. In agreement with previous studies (Izquierdo et al., 2004; Agustin-Pavón et al., 2011), we show that sensory qualities can confer a higher or lower value to an outcome and its associated cue and response, independently of the OFC.
OFC inactivation does not impair reversal learning in a two-reward choice task
Numerous studies have reported that the OFC is essential to rapidly update contingencies in various reversal paradigms (Chudasama and Robbins, 2003; McAlonan and Brown, 2003; Schoenbaum et al., 2003; Kim and Ragozzino, 2005; Boulougouris et al., 2007; Ghods-Sharifi et al., 2008; Burke et al., 2009). In contrast to those studies, we show here that inactivation of the OFC did not impair the initial acquisition of new contingencies in reversal. The reversal procedure used in this study differs from more classical reversal procedures in several ways, and several methodological factors could explain these apparent discrepancies in the results. A first reason for the lack of effect of OFC inactivation on reversal learning might reside in the presentation of cued choice trials. In the present study, reversal learning was evidenced by a reversal of port preference in free choice trials. However, unlike most studies, these free choice trials were intermixed with cued choice trials that allowed rats to experience the new cue–outcome contingencies without having to explore new response strategies. This absence of an exploration requirement might explain the absence of an effect of OFC inactivation on reversal learning. Without cued choice trials, reversal learning might be more sensitive to OFC inactivation. A second possible reason might reside in the type of discrimination training. Most prior studies investigated the role of the OFC in reversal in the context of discrimination between a positive cue (signaling a positive outcome) and a negative cue (signaling an aversive outcome or the absence of reinforcement). As a result, subjects learn to respond to the positive cue and to inhibit responding to the negative cue. When the cue–outcome contingencies are reversed, subjects must inhibit responding to the previously positive cue and overcome the response inhibition produced by the previously negative cue. The OFC has been shown to be essential for the latter, but not the former, process (Schoenbaum and Setlow, 2005; Tait and Brown, 2007; Burke et al., 2009). Indeed, some studies, along with our results from the extinction test, indicate that OFC inactivation facilitates the suppression of a response that is no longer reinforced (Tait and Brown, 2007; Walton et al., 2010). Here, both cues were associated with positive outcomes and there was no obvious sign of response inhibition to a particular cue, as evidenced by high rates of correct responding when cues were presented individually (cued choice trials). Therefore, in the two-reward choice task, reversal consisted of switching between two responses previously identified as rewarding and did not require subjects to overcome response inhibitions. This absence of response disinhibition requirements could explain the lack of effect of OFC inactivation on reversal learning. Whatever the reason(s), the fact that OFC inactivation did not impair reversal learning indicates that other brain regions were sensitive to the sensory quality of the outcome and participated in the reversal learning.
OFC inactivation impairs reversal recall: evidence for multiple memory systems
Inactivation of the OFC before the reversal session impaired later expression of that learning when assessed drug-free 24 h later. These results could indicate a role for the OFC in the consolidation of reversal learning (Laurent and Westbrook, 2009; LaLumiere et al., 2010). However, postsession OFC inactivation did not impair reversal memory, revealing the importance of normal activity in the OFC during the encoding of the new contingencies for the later expression of that learning.
Alternatively, this impaired recall of reversal learning could indicate that circuits recruited to compensate for the absence of a functional OFC fail to durably encode the new contingencies (Quirk et al., 2000). Related to this idea, it has recently been shown that OFC inactivation can increase sensitivity to instantaneous contingencies to the detriment of the representation of an integrated reward history (Riceberg and Shapiro, 2012). However, the fact that OFC inactivation during both reversal learning and recall restored normal performance indicates that a functional OFC was not required to durably encode the new contingencies and to later use this information to guide behavior.
The finding that new contingencies are better remembered when reversal learning and the memory test occur under the same condition is in agreement with a previous study showing that cortex inactivation can produce state-dependent learning (Fortis-Santiago et al., 2010). A hypothesis sometimes advanced to explain state-dependency is that the manipulation (in our case OFC inactivation) generates an interoceptive stimulus that can modulate memory retrieval by acting as a contextual cue (Bouton et al., 1990; Overton, 1991). However, OFC inactivation does not automatically produce state-dependency. For example, while OFC inactivation during reversal learning impairs later reversal memory, a similar inactivation during a strategy switch does not impair memory of the newly learned strategy (Young and Shapiro, 2009). Note that the opposite pattern is observed after inactivation of the medial prefrontal cortex (mPFC; Rich and Shapiro, 2007; Young and Shapiro, 2009). These results are difficult to reconcile with the view that cortical inactivation produces behavioral effects via cueing properties. Indeed, it is not clear why OFC inactivation would produce a discriminative cue during reversal learning but not during a strategy switch, and conversely why mPFC inactivation would produce a discriminative cue during a strategy switch but not during reversal learning.
Instead, a more parsimonious explanation of our results is that the poor reversal memory observed after OFC inactivation results from a conflict between different memory systems (White and McDonald, 2002). Indeed, subjects readily reversed after OFC inactivation, indicating that other brain circuits compensated for the absence of a functional OFC to encode the new cue–outcome associations. At the drug-free test 24 h later, these other brain circuits potentially conflicted with the OFC, which was again fully functional but still encoding pre-reversal contingencies. We suggest that when these different memory systems conflict with one another, the OFC exerts hierarchical control and ultimately guides behavior. This hypothesis predicts that inactivation on both reversal days should abolish conflict between the two memory systems and restore normal behavior. This is exactly what we observed. Inactivation of the OFC on either day of reversal impaired expression of reversal learning. Inactivation of the OFC on both days restored normal performance, suggesting that the poor reversal memory observed 24 h after OFC inactivation resulted from a conflict between information encoded in the OFC and information encoded in other brain regions.
Interaction between OFC and other neural circuits
Our results indicate that the OFC exerts hierarchical control on other brain regions involved in choice behavior. While this study did not identify these other brain regions, the basolateral amygdala (BLA) and the striatum are potential candidates. Indeed, neurons encoding outcome contingencies are found in both the BLA and most subregions of the striatum (Setlow et al., 2003; Saddoris et al., 2005; Paton et al., 2006; Stalnaker et al., 2010), both BLA and striatum (dorsal and ventral) receive OFC projections (Gabbott et al., 2005; Schilman et al., 2008), and both BLA and dorsomedial striatum contribute to reversal learning (Schoenbaum et al., 2003; Ragozzino, 2007). However, reversal of outcome contingencies in BLA neurons depends on OFC integrity (Saddoris et al., 2005), indicating that BLA might not effectively guide behavior in the absence of the OFC (Stalnaker et al., 2007). Rather, we suggest that OFC exerts its hierarchical control on decision-making via OFC-striatum projections. This circuit would provide a way for model-based information originating from the OFC to be integrated with, and possibly overwrite, model-free information in the striatum, thereby allowing subjects to adjust their choices to the current value of the outcome (Pickens et al., 2005; Frank and Claus, 2006; McDannald et al., 2012; Gremel and Costa, 2013).
This work was supported by funding from the State of California for Medical Research on Alcohol and Substance Abuse through the University of California at San Francisco.
The authors declare no competing financial interests.
Correspondence should be addressed to Ronald Keiflin, PhD, UCSF Campus Box 0444, 675 Nelson Rising Lane, San Francisco, CA 94143.