Abstract
An inability to adjust choice preferences in response to changes in reward value may underlie key symptoms of many psychiatric disorders, including chemical and behavioral addictions. We developed the rat gambling task (rGT) to investigate the neurobiology underlying complex decision-making processes. As in the Iowa Gambling Task, the optimal strategy is to avoid choosing larger, riskier rewards and to instead favor options associated with smaller rewards but less loss and, ultimately, greater long-term gain. Given the demonstrated importance of the orbitofrontal cortex (OFC) and basolateral amygdala (BLA) in acquisition of the rGT and Iowa Gambling Task, we used a contralateral disconnection lesion procedure to assess whether functional connectivity between these regions is necessary for optimal decision-making. Disrupting the OFC-BLA pathway retarded acquisition of the rGT. Devaluing the reinforcer by inducing sensory-specific satiety altered decision-making in control groups. In contrast, disconnected rats did not update their choice preference following reward devaluation, either when the devalued reward was still delivered or when animals needed to rely on stored representations of reward value (i.e., during extinction). However, all rats exhibited decreased premature responding and slower response latencies after satiety manipulations. Hence, disconnecting the OFC and BLA did not affect general behavioral changes caused by reduced motivation, but instead prevented alterations in the value of a specific reward from contributing appropriately to cost-benefit decision-making. These results highlight the role of the OFC-BLA pathway in the decision-making process and suggest that communication between these areas is vital for the appropriate assessment of reward value to influence choice.
Introduction
Impairments in making appropriate decisions and updating choice preferences as reward values change are observed in many psychiatric disorders, including pathological gambling and substance abuse. The Iowa Gambling Task (IGT) analyzes “real-world” decision-making in a laboratory setting. Patients with damage to the ventromedial prefrontal cortex (encompassing the orbitofrontal cortex, OFC) or amygdala are impaired on the IGT, preferring the disadvantageous options associated with larger immediate gain but greater long-term loss (Bechara et al., 1994, 1999). We have developed an animal decision-making test modeled after the IGT: the rat gambling task (rGT; Zeeb et al., 2009). Similar to the optimal strategy on the IGT, animals learn to maximize their sugar pellet profits by avoiding riskier options associated with larger immediate reward but greater time-out periods (loss) during which reward cannot be earned. Comparable to data collected from human subjects, bilateral OFC or basolateral amygdala (BLA) lesions administered before rGT training retarded learning of the optimal strategy (Zeeb and Winstanley, 2011).
The analogous effects of OFC and BLA lesions suggest that cross talk between these regions may be important in using cost-benefit information to guide optimal decision-making. The OFC and BLA are reciprocally and functionally connected (Kita and Kitai, 1990; McDonald, 1991; Carmichael and Price, 1995; Schoenbaum et al., 2000). A disconnection procedure, in which unilateral lesions of two regions in contralateral hemispheres are combined, can be used to determine whether connectivity between these areas is important for behavioral output. Using this technique, Baxter et al. (2000) demonstrated that an OFC-amygdala interaction was required for nonhuman primates to avoid responding for stimuli that predicted a devalued reward. Hence, functional connections between the OFC and amygdala are necessary for guiding behavior based on the incentive value of the expected outcome. Therefore, OFC-BLA connectivity may also be important when learning a complex decision-making task such as the rGT, in which animals must use positive and negative information to guide decision-making. To test this hypothesis, rats received a contralateral disconnection, a control lesion, or a sham surgery before rGT training.
Although satiety manipulations have been previously used to alter an animal's motivational state (Cardinal et al., 2000; Cardinal and Howes, 2005; Floresco et al., 2008b; Simon et al., 2009; St Onge and Floresco, 2009), the effects of reward devaluation on risky decision-making are unknown. Communication between the OFC and BLA may contribute to such a behavioral adaptation. Hence, animals were subjected to a sensory-specific satiety (SSS) procedure, in which subjects were sated with the reinforcer before rGT testing. Rats were also tested in extinction following SSS to determine whether animals modulate decision-making preferences using stored representations of the incentive value of each outcome instead of new/updated values obtained through re-experience of the devalued reward during the task. The effects of these manipulations were contrasted with general changes in motivational state caused by acute and long-term satiety with regular chow. To our knowledge, the impact of SSS has not yet been dissociated from that of general acute and chronic satiation in a complex decision-making task.
Materials and Methods
Subjects
Subjects were male Long–Evans rats (n = 56; Charles River Laboratories) weighing between 300 and 350 g at the beginning of the experiment and housed in a temperature-controlled colony room under a 12 h reverse light cycle (lights off at 8:00 A.M.). Animals were initially housed in pairs and then single housed after surgery for 2–3 d during recovery. Rats were then pair-housed again if possible. Animals were tested once daily, at approximately the same time, 5–6 d per week. Testing took place between 9:00 A.M. and 5:00 P.M. Water was always available ad libitum. One and a half weeks prior to beginning rGT training, animals were food restricted to ∼85% of their free-feeding weight by gradually decreasing their amount of food to ∼14 g of standard rat chow per day. During testing, animals were fed immediately upon completion of the task.
At the end of the experiment, animals were killed by exposure to an increasing concentration of carbon dioxide. The brains were then removed and postfixed in a solution of 4% formaldehyde dissolved in PBS for 24 h before being stored in a 30% sucrose solution. Using a cryostat, 40 μm sections were taken throughout the area of interest. Slices were then stained with cresyl violet. The extent of the lesions was determined and mapped onto standardized sections of the rat brain (Paxinos and Watson, 1998).
All experiments were performed in accordance with the Canadian Council of Animal Care and protocols were approved by the animal care committee of the University of British Columbia.
Surgery
Prior to rGT training, rats received either a lesion or sham surgery. A complete disconnection between the OFC and BLA occurs only in the contralateral lesion group. Two alternative lesion-control groups, in addition to the sham-control groups, were included to determine whether any changes in behavior observed in the contralateral lesion group genuinely result from disrupted communication between the OFC and BLA. The unilateral lesion group was included to establish whether a single unilateral lesion of either the OFC or BLA was sufficient to disrupt behavior. Minor deficits caused by unilateral lesions could potentially combine to produce a substantial deficit, often referred to as a “mass-action” effect (Chudasama et al., 2003). The ipsilateral lesion group, in which unilateral lesions of both the BLA and OFC are administered within the same hemisphere, was included to control for this possibility. In the ipsilateral lesion group, communication between the OFC and BLA is still possible within the undamaged hemisphere, in contrast to the contralateral lesion group, in which OFC-BLA connectivity is eliminated.
The stereotaxic coordinates, preparation of quinolinic acid (Sigma-Aldrich), and general surgical methods were identical to those used previously (Zeeb and Winstanley, 2011). A solution of 0.09 M quinolinic acid dissolved in PBS was made fresh daily and adjusted with NaOH to a pH of ∼6.6. Animals were anesthetized with isoflurane and then received an injection of 0.05 ml of 10 mg/ml Anafen (ketoprofen, s.c.) and 10 ml of lactated Ringer's solution (s.c.). Rats were then secured in a stereotaxic frame (David Kopf Instruments) with the incisor bar set at −3.3. Subjects received either an injection of PBS (sham) or 0.09 M quinolinic acid (lesion) into one (the BLA or OFC) or both regions of a single hemisphere. The hemisphere chosen for the injection (left or right) was counterbalanced within each group. The coordinates (Paxinos and Watson, 1998) and rates of infusion for the BLA were as follows: site 1: anterior–posterior (AP) from bregma, −3.0; medial-lateral (ML) from the midline, ±4.8; dorsoventral (DV) from dura, −7.8, 0.1 μl over 2 min; site 2: AP, −2.8; ML, ±4.8; DV, −7.8, 0.2 μl over 2 min. Coordinates for the OFC were as follows: site 1: AP, +4.0; ML, ±0.8; DV, −3.4, 0.2 μl over 3 min; site 2: AP, +3.7; ML, ±2.0; DV, −3.6, 0.3 μl over 2 min; site 3: AP, +3.2; ML, ±2.6; DV, −4.4, 0.3 μl over 3 min.
Rats received either a unilateral lesion of the OFC (n = 6) or BLA (n = 6; unilateral lesion group), a lesion of the OFC and BLA in ipsilateral hemispheres (n = 12, ipsilateral lesion group), or a lesion of the OFC and BLA in contralateral hemispheres (n = 16, contralateral lesion group). A total of 16 rats received a sham surgery (unilateral OFC n = 4; unilateral BLA n = 4; ipsilateral n = 4; contralateral n = 4). One animal from the sham-control group failed to recover from surgery. Statistical analysis using a repeated-measures ANOVA revealed that there was no significant difference in choice preferences on the rGT between the types of sham surgery (i.e., unilateral, ipsilateral, or contralateral) an animal received; therefore, these rats were combined to form a homogeneous sham-control group. Likewise, choice preferences did not differ significantly between animals that received a unilateral OFC or unilateral BLA lesion. As such, these rats were combined to form a single unilateral lesion group.
rGT
Apparatus.
rGT training and testing were similar to methods described previously (Zeeb et al., 2009; Zeeb and Winstanley, 2011). Briefly, testing took place in 16 standard five-hole operant chambers enclosed within a ventilated cabinet (Med Associates). Each chamber was fitted with an array of five stimulus-response holes located on a curved wall 2 cm above a bar floor. A stimulus light was located within each hole and nose-poke responses into the apertures were detected by a horizontal infrared beam. A food tray fitted with a stimulus light and infrared beam was located on the opposite wall. Sucrose pellets (45 mg; Bio-Serv) could be dispensed into the food tray by an automatic pellet dispenser. The entire chamber could be illuminated by a house light located at the top of the chamber. Chambers were controlled by software written in MedPC running on an IBM-compatible computer.
Habituation and training.
Animals received two 30 min habituation sessions during which sucrose pellets were placed in the response holes and food tray. Rats were then trained to respond to a stimulus light located within a stimulus-response hole within 10 s for a sucrose pellet. The spatial location of the stimulus light varied between trials across holes 1, 2, 4, and 5 (left to right). Once animals were consistently completing 100 trials with >80% trials correct and <20% trials omitted, rats were trained on a forced-choice version of the rGT for seven sessions. This ensured that all animals had equal experience with all four reinforcement contingencies to prevent the development of a simple bias toward a particular hole.
rGT.
A schematic diagram of the trial structure of the rGT is presented in Figure 1. Each session lasted 30 min, during which animals were allowed to complete as many trials as possible. Rats started a trial by making a nose-poke response into the illuminated food tray. This response extinguished the tray light and initiated a 5 s intertrial interval (ITI) during which the animal needed to withhold from making a response. Similar to the measurement of motor impulsivity on the five-choice serial reaction time task (5CSRTT; Robbins, 2002), premature responses made at the array during this ITI were signaled by the illumination of the house light for 5 s, after which the tray light was re-illuminated and the animal could start another trial. After the ITI, holes 1, 2, 4, and 5 were illuminated for 10 s (during the forced choice sessions, only one hole was illuminated on any given trial) and the animal could respond in any hole. If the animal failed to respond at the array, the trial was scored as an omission, the lights in the array were extinguished, and the tray light was re-illuminated, allowing the animal to start another trial. A nose-poke response in any hole extinguished the lights in all of the holes and the animal was either rewarded or punished.
Each hole (option) was associated with a different number of rewarding pellets (ranging from 1 to 4), a different probability of receiving reward (ranging from 0.4 to 0.9), and a different magnitude of loss, the duration of a punishing time-out period (ranging from 5 to 40 s; Fig. 1). If the animal was rewarded, the tray light was illuminated and the corresponding number of sucrose pellets was immediately delivered into the food tray. Collection of the reward initiated the next trial. If the animal was punished, no pellets were administered and the light in the hole chosen flashed at a frequency of 0.5 Hz for the duration of the time-out period, during which the animal could not obtain any further reward. At the end of the time-out period, the light in the array was extinguished and the food tray light was illuminated, indicating that the next trial could be started. Perseverative responses made at the array after reward and the array or food tray during the time-out period were recorded but not punished.
The schedule of reinforcement on the rGT was designed such that the advantageous options were associated with greater long-term reward due to the interaction of reward magnitude, probability of receiving reward, and duration of the time-out period. Objectively ranking the pellet options, by calculating the maximum number of pellets that could be earned in a 30 min session if an option was chosen exclusively, indicated that, in terms of reward earned per unit time, the two-pellet choice (P2) was optimal, followed by P1, P3, then P4 (Zeeb et al., 2009; Fig. 1). The location of the pellet choice options was counterbalanced across animals such that there were two versions of the task: version A and version B. According to the hole order in the operant chamber (left to right: 1, 2, 4, and 5), the order of pellet options in version A was P1, P4, P2, and P3, and that in version B was P4, P1, P3, and P2. A total of 27 rats were trained on version A and 28 on version B, counterbalanced within each group. Animals were trained on the rGT until statistically stable patterns of choice behavior were observed over three sessions (see Data and Statistical Analysis section), after which satiety manipulations began.
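The ranking described above can be illustrated with a short calculation. The following is a minimal sketch, not the authors' own computation. The per-option contingencies (pellets, reward probability, time-out duration) are assumptions: they fall within the ranges stated in the text but are given only in Fig. 1 and should be checked against it. The sketch also assumes that each trial costs only the 5 s ITI plus any time-out, ignoring response and reward-collection latencies.

```python
# Minimal sketch: expected pellets per 30 min session when one option is
# chosen exclusively. Contingency values are ASSUMED (to be verified against
# Fig. 1); only the session length and ITI are taken from the text.
SESSION_S = 30 * 60   # session length (s), from the text
ITI_S = 5             # intertrial interval (s), from the text

# option: (pellets, probability of reward, time-out in s) -- assumed values
SCHEDULE = {
    "P1": (1, 0.9, 5),
    "P2": (2, 0.8, 10),
    "P3": (3, 0.5, 30),
    "P4": (4, 0.4, 40),
}

def expected_pellets(pellets, p_win, timeout_s):
    """Expected pellets per session under exclusive choice of one option."""
    # Mean trial duration: the ITI plus the time-out weighted by loss probability.
    mean_trial_s = ITI_S + (1 - p_win) * timeout_s
    trials = SESSION_S / mean_trial_s
    return trials * p_win * pellets

totals = {name: expected_pellets(*params) for name, params in SCHEDULE.items()}
ranking = sorted(totals, key=totals.get, reverse=True)
print(ranking)
```

Under these assumptions the ordering is P2, P1, P3, P4, which agrees with the ranking reported in the text.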
Satiety manipulations
Methods.
After rGT training, rats were subjected to four different satiety manipulations. For the first satiety manipulation (SSS), rats were allowed free access to the same sucrose pellets used as reward in the rGT for 1 h in their home cage and then tested on the rGT. The second manipulation was identical to SSS, but rats were tested on the rGT in extinction (i.e., trials that would have resulted in delivery of a sucrose pellet were no longer rewarded and animals could immediately start another trial; SSS-Ext). On the third manipulation, acute food satiety (food satiety), rats were sated with regular chow in their home cage (the same food the animals normally received). After these satiety manipulations, animals were returned to a clean home cage upon completion of the rGT testing so that no hoarded food could be obtained. A period of 1 week occurred between each acute satiation manipulation, during which rats were tested on the rGT for 4 or 5 d. Immediately after acute food satiety, animals were removed from food restriction (i.e., received chow ad libitum) and were tested on the rGT for 4 further sessions (chronic satiation).
Rationale and predictions.
Previous experiments using an SSS manipulation have found that animals with an OFC-amygdala disconnection are unable to use information regarding the incentive value of the reinforcer to alter behavior (Baxter et al., 2000). However, when an action and outcome were repeatedly presented together in a progressive-ratio task, SSS decreased responding similarly in control animals and in subjects with an OFC-amygdala disconnection (Baxter et al., 2000). The authors therefore concluded that OFC and amygdala connectivity is critical in associating the incentive value of a reward with an action, but only if the stimuli signaling the reward and the outcome are not explicitly paired such that the animal must rely on stored representations of value. In the present study, animals were subjected to SSS immediately before rGT testing to determine the effects of reward devaluation on decision-making preferences. However, animals would not be forced to rely on stored representations of the expected value associated with each option, because new or updated values could be obtained through re-experience of the devalued reward during the test session. Therefore, animals were also tested in extinction after SSS (SSS-Ext); this manipulation allowed us to determine whether rats would alter their choice preferences using stored representations of the incentive value of the outcome associated with each option. Extrapolating the findings from Baxter et al. (2000) to risky decision-making behavior using the rGT, we predicted that rats with an OFC-BLA disconnection would be able to alter their decision-making preferences following SSS, but not after SSS-Ext.
Animals were also exposed to acute (food satiety) and long-term (chronic satiation) satiation with regular chow to determine whether a decrease in motivational state would alter decision-making preferences. A similar pattern of decision-making following SSS and acute food satiety would suggest that the effects of SSS were due to the animals' general decreased motivation to earn food, rather than from any recalibration of the cost-benefit analysis caused by devaluing the soon-to-be-earned reward. Furthermore, previous studies have shown that acute satiety with regular chow does not alter decision-making preferences on other cognitive tasks (Cardinal et al., 2000; Cardinal and Howes, 2005; Floresco et al., 2008b; Simon et al., 2009). However, a longer period of satiation (4–6 d) caused animals to choose suboptimally on a probability-discounting task, during which rats chose between options leading to a small, certain reward or a large, but probabilistic reward (St Onge and Floresco, 2009). Therefore, we predicted that acute food satiety would not alter decision-making preferences in any group, whereas chronic satiety may impair the animals' ability to maintain the optimal strategy.
Data and statistical analyses
Analysis of the variables measured on the rGT was conducted according to methods described previously (Zeeb et al., 2009; Zeeb and Winstanley, 2011). For each testing session, the percent of trials on which an animal chose each option and the percent of omitted trials were calculated. The percent of premature responses made was calculated as a fraction of the number of trials initiated. An arcsine transformation was performed prior to statistical analysis of variables expressed as a percentage to limit the effect of an artificially imposed ceiling (McDonald, 2009). The number of perseverative responses made during the punishment period was analyzed as a fraction of the total amount of punishment experienced. Likewise, perseverative responses made after reward delivery were calculated as a fraction of the total number of trials rewarded. In addition, the total number of trials completed and the latency to respond at the array and to collect reward for each choice option were analyzed.
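The percentage measures and the arcsine transformation can be sketched as follows. This is an illustrative sketch only. The text does not state which form of the transformation was used, so the common arcsine square-root form is assumed here. The session counts are hypothetical, and the denominator for choice percentages (trials on which a choice was made) is likewise an assumption.

```python
import math

def percent(count, total):
    """Percentage of `total` represented by `count` (0 if total is 0)."""
    return 100.0 * count / total if total else 0.0

def arcsine_transform(pct):
    """Arcsine square-root transform of a percentage (0-100), in radians.

    ASSUMPTION: the text says only 'arcsine transformation'; this is the
    common asin(sqrt(p)) variant.
    """
    proportion = min(max(pct / 100.0, 0.0), 1.0)
    return math.asin(math.sqrt(proportion))

# Hypothetical counts for one session (illustration only, not study data).
choices = {"P1": 20, "P2": 60, "P3": 12, "P4": 8}
n_choices = sum(choices.values())
choice_pct = {opt: percent(n, n_choices) for opt, n in choices.items()}
transformed = {opt: arcsine_transform(p) for opt, p in choice_pct.items()}

# Premature responses expressed as a fraction of trials initiated.
premature_pct = percent(15, 120)
```

The transform stretches values near 0% and 100%, which is why it is applied to variables bounded by an imposed ceiling.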
Statistical analysis was conducted using SYSTAT for Windows (version number 12.00.08). For all analyses, p ≤ 0.05 denoted significance. Data were analyzed using repeated-measures ANOVA with choice and/or session as within-subjects factors and group (two levels: lesion vs sham) as a between-subjects factor. Based on the results from Baxter et al. (2000), we made an a priori assumption that the contralateral lesion group, but not the other control lesion groups, would differ from the sham-operated group. We therefore conducted a series of planned analyses comparing each lesion group with the sham-operated control group, similar to previous disconnection lesion studies (e.g., Baxter et al., 2000; Chudasama et al., 2003). Post hoc analysis was conducted using a paired-sample t test when comparing data within a group or a two-sample t test when comparing data between groups.
During task acquisition, choice preferences on the rGT were analyzed in blocks of five sessions using a repeated-measures ANOVA, with session (five levels) and choice (four levels, P1–P4) as within-subjects factors and group (two levels: sham-control group vs lesion group) as a between-subjects factor. A follow-up analysis using a repeated-measures ANOVA comparing the choice of each option across each block was also conducted if a main effect of group or choice × group was observed.
Stable performance on the rGT occurred when a statistically stable pattern of choice, number of trials completed, and premature responses was observed across at least three sessions (i.e., where p > 0.05 in a repeated-measures ANOVA for either choice × session or session). Analysis of baseline behavior was conducted using the average of these last three sessions before satiety manipulations. To compare the effects of each satiety manipulation, a within-subjects factor of satiety day (two levels: baseline average vs satiety manipulation) was included in the ANOVA. The difference between the total number of trials completed at baseline and on each satiety test day was also determined. On the first day of chronic food satiety, one animal from the contralateral lesion group and three animals from the ipsilateral lesion group were excluded due to a technical error that confounded the data obtained. Animals' choice preferences were not permanently altered by any acute satiety manipulation, as indicated by a lack of session × choice effect when comparing the baseline average with either the average of the last three sessions before SSS during extinction challenge or acute food satiety (all groups, session × choice, all F < 0.640, not significant [NS]).
Using the animals' baseline behavior, the sum of the best (percentage choice of P1 and P2) and worst (percentage choice of P3 and P4) options, as well as the difference score (percentage choice of the best options minus the worst options) was also determined. A two-sample t test was used to analyze any group differences for these measures. Similar to the analysis of subjects tested on the IGT (Denburg et al., 2005) and previous research using the rGT (Zeeb et al., 2013), animals were excluded if their score was significantly < 0 (p < 0.05, one-tailed). Five rats (two sham-control rats and three lesion rats) were excluded based on this criterion.
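The summary scores defined above reduce to simple arithmetic on the percentage choice of each option. The sketch below uses hypothetical percentages for a single animal. The function name is illustrative, and the one-tailed exclusion test is not implemented because the text does not specify which test statistic was used.

```python
def difference_score(choice_pct):
    """Return (best, worst, difference) from percentage choice of P1-P4.

    best = P1 + P2 (advantageous options); worst = P3 + P4
    (disadvantageous options); difference = best - worst.
    """
    best = choice_pct["P1"] + choice_pct["P2"]
    worst = choice_pct["P3"] + choice_pct["P4"]
    return best, worst, best - worst

# Hypothetical baseline percentages for one animal (not study data).
example = {"P1": 20.0, "P2": 60.0, "P3": 12.0, "P4": 8.0}
best, worst, score = difference_score(example)
print(best, worst, score)
```

A positive score indicates a preference for the advantageous options. Animals whose score was significantly below zero were the ones excluded.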
The rGT version that the animal was tested on did not significantly affect choice preferences during task acquisition (days 1–20, all rats, version: F(1,31) = 0.134, NS; version × group: F(3,31) = 2.208, NS; choice × version × group: F(9,93) = 1.139, NS). Furthermore, the physical location of the options in the chamber (left or right side, P1 and P4, or P2 and P3, respectively) did not bias decision-making during task acquisition (e.g., days 1–5, contralateral lesion vs sham-control group: side: F(1,20) = 1.405, NS; side × lesion: F(1,20) = 2.113, NS). Although a main effect of version was observed when analyzing data from the last three stable testing sessions (all rats, version: F(1,31) = 4.813, p = 0.04), the version the animals were tested on did not affect choice patterns differentially between each group (all rats, choice × group × version: F(9,93) = 0.956, NS; session × choice × version: F(6,186) = 1.601, NS; session × choice × group × version: F(18,186) = 0.592, NS). Therefore, consistent with previous studies (Zeeb et al., 2009, 2013; Zeeb and Winstanley, 2011), animals were not separated by version for subsequent analyses.
Results
Lesion analysis
Animals were excluded from the experiment if the lesion area was too small or if damage extended far into surrounding regions (Fig. 2). In addition, three rats from the sham-control group exhibited clear neuronal cell damage in the target region and were therefore removed from the study to ensure that our sham-operated control group did not contain rats with minor lesions. A total of 10 animals remained in the sham-control group (version A, n = 3; version B, n = 7), 11 animals remained in the unilateral lesion group (version A, n = 5; version B, n = 6), eight animals remained in the ipsilateral lesion group (version A, n = 4; version B, n = 4), and 10 animals remained in the contralateral lesion group (version A, n = 5; version B, n = 5). Within the ipsilateral lesion group, an equal number of animals received a lesion in either the left or right hemispheres. In the contralateral lesion group, six rats received lesions of the left OFC and right BLA and four rats received lesions of the right OFC and left BLA. The location of the lesion did not significantly affect choice preferences in either lesion group (all days, day × choice × side: ipsilateral lesion group: F(57,342) = 1.120, NS; contralateral lesion group: F(57,456) = 0.532, NS).
Disconnecting the OFC and BLA slowed rGT acquisition
Choice preferences
All animals changed their choice patterns throughout rGT training, similar to previous experiments (all days, day × choice, sham: F(57,513) = 2.901, p < 0.001; unilateral lesion: F(57,570) = 3.681, p < 0.001; ipsilateral lesion: F(57,399) = 6.706, p < 0.001; contralateral lesion: F(57,513) = 3.826, p < 0.001). Compared with sham-control rats, animals with a functional disconnection of the OFC and BLA (contralateral lesion group) were slower to learn the optimal strategy in the first 5 d of training (Fig. 3). Similar to the learning impairment observed following a bilateral lesion of the OFC or BLA (Zeeb and Winstanley, 2011), animals with contralateral lesions, compared with sham-control rats, initially chose the best option, P2, less and the worst option, P4, more (days 1–5, choice × group: F(3,54) = 2.931, p = 0.04; group, P2: F(1,18) = 4.191, p = 0.05; P4: F(1,18) = 7.668, p = 0.01; other options: all F < 1.023, NS). In addition, rats with contralateral lesions tended to choose P4 more often than the sham-control group during training days 6–10 (choice × group: F(3,54) = 1.590, NS; group, P4: F(1,18) = 3.552, p = 0.08; other options: all F < 2.114; NS). No further differences were observed between the contralateral lesion group and sham-control rats (all other blocks, choice × group: all F < 1.799, NS).
Although some attenuation of learning was observed when comparing the sham-control and ipsilateral lesion groups, this finding was not significant (Fig. 3; days 1–5, choice × group: F(3,54) = 1.402, NS; all other blocks: all F < 0.975, NS). Furthermore, there were no significant differences between the sham-control and unilateral lesion groups during task acquisition (all blocks, choice × group: all F < 2.207, NS).
Despite the learning impairment observed in the disconnected rats, analysis of the last three sessions of rGT training indicated that there were no significant differences in choice preference between the sham-control group and any lesion group (Fig. 3; choice × group: unilateral lesion vs sham: F(3,57) = 0.779, NS; ipsilateral lesion vs sham: F(3,48) = 0.110, NS; contralateral lesion vs sham: F(3,54) = 1.113, NS). Furthermore, there were no differences between the sham-control and lesion groups in the animals' combined preference of the best (P1 and P2) or worst (P3 and P4) options or the difference between these options (Table 1). These results are again similar to animals that received bilateral lesions of either the OFC or BLA before training on the rGT (Zeeb and Winstanley, 2011).
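The degrees of freedom reported for the choice × group comparisons can be checked against the final group sizes given in the Lesion analysis section (sham, n = 10; unilateral, n = 11; ipsilateral, n = 8; contralateral, n = 10). The sketch below assumes a standard mixed-design ANOVA with no sphericity correction applied to the degrees of freedom. The function name is illustrative.

```python
def interaction_df(n_within_levels, group_sizes):
    """Degrees of freedom for a within x between interaction in a mixed ANOVA.

    ASSUMPTION: uncorrected df (no Greenhouse-Geisser or similar adjustment).
    """
    n_subjects = sum(group_sizes)
    n_groups = len(group_sizes)
    effect_df = (n_within_levels - 1) * (n_groups - 1)
    error_df = (n_within_levels - 1) * (n_subjects - n_groups)
    return effect_df, error_df

SHAM_N = 10
LESION_N = {"unilateral": 11, "ipsilateral": 8, "contralateral": 10}

# Choice has four levels (P1-P4); each lesion group is compared with sham.
for name, n in LESION_N.items():
    print(name, interaction_df(4, [SHAM_N, n]))
```

With these group sizes the sketch gives (3, 57), (3, 48), and (3, 54) for the unilateral, ipsilateral, and contralateral comparisons, respectively, which match the values reported above.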
Other behavioral measurements
Data and statistical information for all other behavioral measurements are presented in Table 1. After training on the rGT, there were no group differences in perseverative responding during the punishment period or following reward. Furthermore, there were no differences in the number of trials completed or premature responses made when comparing each lesion group to the sham-control rats. Omissions remained low across all groups and rats with contralateral lesions made significantly fewer omissions than sham-control rats. Although there were no differences between groups in the latency to choose an option, animals with ipsilateral lesions were slower to collect rewards. Although significant, the magnitude of these effects was not large; therefore, these observations may not be of behavioral significance.
Acute satiety results
Choice preferences
In response to acute satiety with regular chow (food satiety), rats did not change their choice pattern compared to baseline performance (Fig. 4; satiety day × choice: sham: F(3,27) = 0.606, NS; unilateral lesion: F(3,30) = 1.296, NS; ipsilateral lesion: F(3,21) = 0.190, NS; contralateral lesion: F(3,27) = 0.524, NS). These results suggest that acute satiation with regular chow does not devalue the rewarding sucrose pellets used in the task.
In contrast, animals changed their choice preference after acute satiation with sucrose pellets (i.e., SSS; Fig. 4; satiety day × choice: F(3,27) = 5.128, p = 0.006). Specifically, in response to acute SSS, sham-control animals were less willing to tolerate the longer and more frequent punishing time-outs associated with P2 and instead chose P1 more often, a less-optimal option associated with the shortest and least frequent punishments. A similar effect was also observed in the unilateral and ipsilateral lesion groups (satiety day × choice, unilateral lesion: F(3,30) = 5.729, p = 0.003; ipsilateral lesion: F(3,21) = 5.381, p = 0.007). In contrast, a functional disconnection of the OFC and BLA prevented animals from changing their response strategy after acute SSS (Fig. 4; satiety day × choice: F(3,27) = 0.523, NS). In sum, the control groups (sham-control and the unilateral and ipsilateral lesion groups) were less tolerant of the time-out punishments following reward devaluation, whereas this manipulation did not affect decision-making in rats with an OFC-BLA disconnection.
Animals were then tested in extinction following acute SSS (SSS-Ext). Because animals were no longer rewarded, they were unable to update their representations of reward value after direct experience of the devalued reward in the test session, and therefore relied only on stored representations of reward value while performing the task. If connections between the OFC and BLA were critical for updating these reward representations, animals in the contralateral group should fail to show any changes in choice preferences under these conditions. Similar to the effects of acute SSS, there was no effect of SSS-Ext on choice preference in the contralateral lesion group (satiety day × choice: F(3,27) = 1.630, NS).
In contrast, sham-control rats changed their choice preference following SSS-Ext, increasing their choice of P1 and decreasing their choice of P2, similar to acute SSS alone. However, during extinction, these rats also tended to choose the disadvantageous options, P3 and P4, more often, which may indicate that these rats were investigating the other options more thoroughly to determine whether all options no longer yielded reward (Fig. 4; satiety day × choice: F(3,27) = 15.637, p < 0.001). A similar effect was observed in the unilateral lesion group and, to a lesser extent, also in the ipsilateral lesion group (satiety day × choice, unilateral lesion: F(3,30) = 11.559, p < 0.001; ipsilateral lesion: F(3,21) = 3.308, p = 0.04).
Trials completed
To compare the effects of satiety manipulations between groups, the difference between the total number of trials completed at baseline and each satiety manipulation was determined (Fig. 5). Compared with sham-control rats, the unilateral and ipsilateral lesion groups showed a similar reduction from baseline in the number of trials completed after all acute satiety manipulations (all t < −1.029, NS; Fig. 5). In contrast, rats with a contralateral lesion did not show as large a decrease in the number of trials completed as sham-control rats after SSS (Fig. 5; t(18) = 2.371, p = 0.03). A similar effect was observed during SSS-Ext, although this result just failed to reach significance (t(18) = 1.968, p = 0.07). However, no significant differences were observed between the sham-control and contralateral lesion groups after acute satiety with regular food (t(18) = 0.168, NS). Therefore, in contrast to sham-control rats or animals with a unilateral or ipsilateral lesion, rats with an OFC-BLA disconnection did not markedly decrease the number of trials performed following either acute SSS or SSS-Ext.
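The between-group comparison described above can be sketched as follows. This is a minimal illustration, not the authors' analysis code: it assumes the difference score is computed as manipulation minus baseline and that groups were compared with a pooled-variance (Student's) independent-samples t test, which is consistent with the reported degrees of freedom (t(18) for two groups of 10) but is not stated in the text. The function names and the numbers in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def difference_scores(baseline, manipulation):
    """Per-animal change in trials completed (manipulation minus baseline; sign is an assumption)."""
    return [m - b for b, m in zip(baseline, manipulation)]


def pooled_t(group_a, group_b):
    """Student's independent-samples t statistic with pooled variance, plus its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its own degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical trial counts for illustration only; these are not data from the study.
sham_change = difference_scores([100, 100, 100], [60, 70, 80])
contra_change = difference_scores([100, 100, 100], [90, 95, 100])
t_value, df = pooled_t(sham_change, contra_change)
```

With two groups of 10 animals, the same function would return 18 degrees of freedom, matching the form of the statistics reported for the sham-control versus contralateral comparison.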
Other behavioral measurements
Data and the results from statistical analysis for the other variables measured during rGT performance are presented in Table 2. All acute satiety manipulations increased the latency to choose an option and to collect reward for all rats. Likewise, the number of omissions increased and the number of premature responses made decreased for all groups. Therefore, all acute satiety manipulations similarly affected the animals' motivation to perform the rGT and to collect rewards, even though animals with a disconnection lesion failed to change their choice preference in response to any manipulation.
During extinction, perseverative responding after a trial that would have been rewarded increased in all groups. However, during SSS, increased perseverative responding after reward was observed only in the sham-control group. Interestingly, acute food satiety did not alter the number of perseverative responses after reward in any group. In contrast, rats in the unilateral or contralateral lesion group and sham-control rats decreased the number of perseverative responses made during the punishment period for all acute satiety manipulations. Rats in the ipsilateral lesion group only showed this effect after acute food satiety, although the reason for this is unclear.
Chronic satiety with chow results
Choice preferences
Sham-control rats and rats with unilateral lesions initially changed their choice preference during chronic satiety manipulations (Fig. 6). Specifically, compared to their baseline performance, rats in the sham-control group chose P2 significantly less on the first day of chronic satiation (day 1), whereas this effect occurred on days 1 and 2 for rats with a unilateral lesion (day 1, sham: satiety day × choice: F(3,27) = 4.388, p = 0.01; P2: t(9) = 3.398, p = 0.008; unilateral lesion: satiety day × choice: F(3,30) = 2.452, p = 0.08; P2: t(10) = 2.183, p = 0.05; day 2, sham: F(3,27) = 2.189, NS; unilateral lesion: satiety day × choice: F(3,30) = 4.495, p = 0.01; P2: t(10) = 3.382, p = 0.007). Choice patterns did not significantly differ from baseline in the two groups on the third or fourth day of chronic food satiety (satiety day × choice: all F < 1.565, NS). In contrast, rats in either the ipsilateral or contralateral lesion groups did not significantly change their choice preference on any chronic satiety day (Fig. 6; satiety day × choice: all F < 1.947, NS).
Trials completed
There were no differences between any lesion group and the sham-control rats for any chronic feeding day, indicating that all rats similarly decreased the number of trials completed compared with baseline (Fig. 7; all chronic satiety days: all t < 1.613, NS).
Other behavioral measurements
Because measurements of other variables did not change much during the chronic satiety period, statistical analysis for these variables was performed on the average of these data over the 4 d period (Table 2). All animals took more time to make a choice and to collect reward during chronic satiety days. Likewise, the number of omissions increased and premature responses decreased. The number of perseverative responses made was also significantly decreased in the sham-control, unilateral lesion, and contralateral lesion groups; however, there were no differences in perseverative responses made after a rewarded trial, similar to the acute satiety manipulations. These results suggest that animals were less motivated to perform the rGT after chronic satiation with regular chow.
Discussion
Disconnecting the OFC and BLA slowed acquisition of the rGT, demonstrating for the first time that communication between these regions facilitates the development of adaptive decision-making under risk. This manipulation also prevented animals from modulating their decision-making strategy following reward devaluation via SSS, regardless of whether rats relied on stored representations of reward value (i.e., during extinction) or if the devalued reward was delivered. However, disconnected rats exhibited longer response and reward collection latencies, as well as reduced premature responding, following all satiety manipulations. Hence, disconnecting the OFC and BLA did not prevent general changes in motivational state from influencing responding, but instead prevented alterations in specific reward value from recalibrating cost-benefit decision-making.
In the rGT, animals determine the optimal strategy by integrating information regarding the probability and magnitude of expected rewards and punishments (Zeeb et al., 2009). All rats initially sampled the least from the higher reward options associated with greater loss. This is consistent with previous data (Zeeb et al., 2009, 2013; Zeeb and Winstanley, 2011) and suggests that animals readily learn that the high-risk, high-reward options are disadvantageous. In terms of the greatest number of pellets that could be earned per session, the two best options are P1 and P2. Discriminating between these options was impaired during task acquisition following an OFC-BLA disconnection, possibly due to an inability to assimilate multiple factors to determine which option yields the greatest long-term reward. As a similar learning impairment was observed after bilateral OFC or BLA lesions (Zeeb and Winstanley, 2011), a loss of communication between these areas was likely responsible for the learning deficit observed following bilateral lesions of either region.
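The claim that P1 and P2 yield the most pellets per session can be illustrated with a small worked calculation. The schedule values below (pellets, win probability, time-out duration, a 5 s intertrial interval, and a 30 min session) are assumptions taken from the commonly described rGT design (Zeeb et al., 2009); they are not stated in this section. The sketch also assumes a hypothetical rat that chooses one option exclusively and responds instantly, so it gives an upper bound rather than a prediction of behavior.

```python
SESSION_S = 30 * 60   # assumed session length in seconds
ITI_S = 5             # assumed intertrial interval in seconds

# option: (pellets per win, probability of win, time-out in seconds on a loss); assumed values
OPTIONS = {
    "P1": (1, 0.9, 5),
    "P2": (2, 0.8, 10),
    "P3": (3, 0.5, 30),
    "P4": (4, 0.4, 40),
}


def expected_pellets(pellets, p_win, timeout_s):
    """Expected pellets per session when one option is chosen exclusively."""
    # Every trial costs the intertrial interval; losing trials add the time-out.
    mean_trial_s = ITI_S + (1 - p_win) * timeout_s
    trials_per_session = SESSION_S / mean_trial_s
    return trials_per_session * p_win * pellets


# Options ordered from most to least profitable under these assumptions.
ranking = sorted(OPTIONS, key=lambda name: expected_pellets(*OPTIONS[name]), reverse=True)
```

Under these assumed values the ordering is P2, P1, P3, P4: the larger rewards of P3 and P4 are outweighed by the session time lost to their longer and more frequent time-outs.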
As to the mechanism underlying this effect, disconnected animals may have placed a larger weight on one factor (e.g., reward probability), which could have biased their choice toward P1 (greatest probability of winning). However, based on previous control experiments in which either the probability of reward or the punishment duration was kept constant across all four options (Zeeb et al., 2009), the OFC-BLA disconnection did not produce the same pattern of behavior that we would expect if animals were ignoring either of these factors. Furthermore, although BLA inactivations increased choice of smaller, certain over larger, uncertain rewards in a probability-discounting task (Ghods-Sharifi et al., 2009), OFC inactivations did not affect decision-making in this paradigm (St Onge et al., 2010). Such data do not suggest that a contralateral disconnection of the OFC and BLA would shift preference toward more certain rewards in the rGT.
It may be pertinent to note that the discrimination disconnected animals struggled to resolve was between the two options associated with the smallest units of reward, the lowest probabilities of punishment, and the shortest and most similar time-out periods. Nonhuman primates with OFC damage demonstrated a decreased sensitivity to differences in reward size or duration of a delay-to-reward, especially when reward magnitudes were small or delays were short (Simmons et al., 2010). Likewise, lateral OFC lesions cause subjects to switch more rapidly between choices, particularly when the reward value is difficult to discern (Noonan et al., 2012). Furthermore, OFC lesions may enhance the “spread-of-effect,” in which the outcome of a single action reinforces not only the action that led to the outcome, but also actions that occur before or after that response, thereby impairing animals' ability to discriminate between probabilities with the narrowest range of relative values (Walton et al., 2010). Although it is unclear whether OFC-BLA connectivity contributes to this phenomenon, such data fit well with our observations that rats with an OFC-BLA disconnection were impaired at dissociating P1 from P2 during the initial learning phase of the task, when animals must keep track of multiple action-outcome events.
If disconnecting the OFC and BLA impaired the animals' ability to discriminate between options close together in value, then reducing the task difficulty by increasing the difference between the relative values of each option may eliminate the learning impairment. Alternatively, if the disconnection produced a more general deficit in integrating multiple factors to determine the best option, animals would be unimpaired when discriminating between options that only vary along one dimension, even if the difference between these options is quite small. Such hypotheses will be addressed in future experiments.
At a neuronal level, electrophysiological studies involving OFC- or BLA-lesioned rats suggest that animals with an OFC-BLA disconnection may have been slower to learn the optimal strategy on the rGT due to weaker encoding of the expected values associated with each response option (Schoenbaum et al., 2003; Saddoris et al., 2005). Neurons in the OFC may have been unable to appropriately represent the incentive value, or state information, of each outcome without input from the BLA (Schoenbaum et al., 2003; Takahashi et al., 2011). In addition, BLA neurons would have been unable to benefit from updated value representations generated by the OFC during the early stages of learning (Saddoris et al., 2005). Therefore, corticopetal and/or corticofugal pathways may contribute to forming such associations when learning the rGT.
Although OFC-BLA communication is clearly important while animals are determining the best option through trial and error, the fact that animals with disconnection lesions eventually learn the optimal strategy suggests that another brain region can bridge the gap between the OFC and BLA. Both the mediodorsal nucleus of the thalamus and nucleus accumbens are likely candidates, given their anatomical connections to the OFC and BLA (Kita and Kitai, 1990; McDonald, 1991; Turner and Herkenham, 1991; Carmichael and Price, 1995) and their involvement in behaviors relevant to rGT performance, such as behavioral flexibility and decision-making (Floresco et al., 1999, 2008a; Chudasama et al., 2001; Block et al., 2007). Future experiments will aim to determine the role of these regions in rGT performance.
In response to reinforcer devaluation, control rats chose P1 more often, as if the reward was no longer sufficiently appetitive to justify experiencing longer time-outs. In contrast, animals with a functional disconnection of the OFC and BLA failed to update their choice preference when the reward was devalued. Hence, these animals did not recalibrate their willingness to tolerate punitive time-out periods when the value of the reward declined. It could be suggested that rats with an OFC-BLA disconnection are outcome insensitive (i.e., are not using the reward to guide their choice). Yet, similar to control rats, disconnected rats appeared less motivated to perform the rGT, as indicated by increased latencies to choose an option or collect reward and decreased premature responding. Furthermore, nonhuman primates with an OFC-amygdala disconnection are able to learn new response-outcome associations despite demonstrating a deficit on a discrimination test after SSS (Baxter et al., 2000). Therefore, rats with an OFC-BLA disconnection are likely not completely outcome insensitive, but fail to modulate their decision-making strategy after a change in reward value.
The inability of SSS to alter decision-making in OFC-BLA disconnected rats, despite still affecting their motivational state, suggests that different neural circuitry may be involved in the motivation to obtain reward and in the assessment of value during the decision-making process. Interestingly, sating rats with regular chow had similar effects on choice latency and premature responding in the absence of any change in choice preference. These results further emphasize the distinction between an animal's global motivational state and the assignment of value in the context of cost-benefit decision-making.
To further probe the utilization of information regarding reward value to guide choice, animals were tested in extinction after SSS. During extinction, animals must rely solely on stored representations of the value of each option when making decisions, rather than forming new stimulus–reward associations based on re-experience of the now devalued reward (Balleine et al., 1995; Baxter et al., 2000). Control rats demonstrated an even greater alteration in choice patterns, selecting more indiscriminately from P1, P3, and P4. This behavior perhaps indicated a return to an explorative rather than exploitative strategy. However, disconnected rats did not update their choice pattern in response to this manipulation—these rats failed to recognize that their strategy was no longer optimal and did not look for an alternative approach.
Similar to the sham-control group, reward devaluation and extinction testing altered decision-making in rats with OFC and BLA lesions in the same hemisphere; however, these rats did not show such a dramatic change in choice preferences during extinction, partially reproducing the effects of the contralateral disconnection. These data suggest that damage to the ipsilateral OFC-BLA connections prevents the optimal utilization of stored representations of value during decision-making. This is not the first time that disrupting the ipsilateral OFC-BLA connectivity has mimicked the effects of a contralateral disconnection (Lasseter et al., 2011) or that combined unilateral OFC and amygdala lesions produce a behavioral deficit (Izquierdo and Murray, 2004), suggesting that this pathway is of some behavioral significance. However, animals with damage to the ipsilateral OFC-BLA pathway are still capable of updating these representations—or forming new ones—after re-exposure to a devalued reward. In contrast, a contralateral disconnection profoundly disrupted both processes.
In sum, the results presented here demonstrate that functional connectivity between the OFC and BLA facilitates learning in a rodent analogue of the IGT and is necessary to enable appropriate behavioral adaptations after changes in reward value. Therefore, impaired communication between these regions may contribute to the decision-making deficits observed on the IGT in patients with chemical and behavioral addictions (Bechara et al., 2001; Power et al., 2011).
Footnotes
This work was supported by an operating grant from the Canadian Institutes for Health Research (to C.A.W.). C.A.W. also receives salary support through the Michael Smith Foundation for Health Research and the Canadian Institutes for Health Research New Investigator Award program.
C.A.W. has previously consulted for Theravance on an unrelated matter. The authors declare no competing financial interests.
Correspondence should be addressed to either of the following: Dr. Fiona D. Zeeb, Centre for Addiction and Mental Health, 250 College Street, Toronto ON, M5T 1R8 Canada, fiona.zeeb@camh.ca; or Dr. Catharine A. Winstanley, University of British Columbia, Department of Psychology, 2136 West Mall, Vancouver BC, V6T 1Z4 Canada, cwinstanley@psych.ubc.ca