Abstract
In nonhuman primates, interaction between the orbitofrontal cortex (OFC) and the amygdala (AMG) has been seen as critical for learning and subsequently changing associations between stimuli and reinforcement. However, it is still unclear what the precise role of the OFC is in altering these stimulus–reward associations, and recent research has questioned whether the AMG makes an essential contribution at all. To gain a better understanding of the role of these two structures in flexibly associating stimuli with reinforcement, we reanalyzed a set of previously published data from groups of monkeys with either OFC or AMG lesions that had been tested on an object reversal learning task. Based on trial-by-trial analyses of rewarded and unrewarded choices, we report two new findings. First, monkeys with OFC lesions were, compared with both control and AMG groups, unable to use correctly performed trials to optimally guide subsequent choices. Second, monkeys with AMG lesions showed the opposite pattern of behavior. This group benefited more than controls from correctly performed trials that followed an error. Finally, as has been reported by others, after a change in reward contingencies, monkeys with OFC lesions also showed a slightly greater tendency to choose the previously rewarded object. These findings demonstrate that the OFC and AMG make different contributions to object reversal learning not highlighted previously.
Introduction
The ability to associate reinforcement with previously neutral stimuli in the environment and to subsequently change these associations is critical for adaptive reward-guided behavior and decision making. In nonhuman primates, two brain regions, the orbitofrontal cortex (OFC) and amygdala (AMG), have been considered central to this ability. Aspiration or radio frequency lesions of either structure lead to profound and reproducible deficits in object reversal learning, a task that requires flexible alteration of object–reward associations (Mishkin, 1964; Schwartzbaum and Poulos, 1965; Iversen and Mishkin, 1970; Jones and Mishkin, 1972; Aggleton and Passingham, 1981; Dias et al., 1996; Izquierdo et al., 2004). The view that emerged was one in which OFC and AMG contributed in a similar manner to a monolithic process: stimulus–reward association.
Contrary to the long-standing view, several studies now suggest that OFC and AMG make distinct contributions to altering stimulus–reward associations. For example, Izquierdo and Murray (2007) demonstrated that monkeys with selective excitotoxic lesions of the AMG learned to reverse object–reward associations as efficiently as controls. Interestingly, whereas monkeys with AMG lesions were able to flexibly change stimulus–reward associations as measured by object reversal learning, they were unable to update stimulus–reward value associations in a test of reinforcement devaluation. These findings point to a distinction in the neural substrates underlying associations between objects and reward contingencies versus those underlying associations between objects and reward value. In addition, whereas damage to the OFC affects the ability of monkeys to extinguish responding to an object paired previously with reinforcement, lesions of the AMG disrupt the ability for an object associated with reinforcement to maintain responding in both extinction and second order schedules (Parkinson et al., 2001; Pears et al., 2003; Izquierdo and Murray, 2005).
Animals' choices during reversal learning are typically analyzed by counting errors taken to learn each reversal (Iversen and Mishkin, 1970; Dias et al., 1996; Chudasama and Robbins, 2003). However, such an approach does not take into account the dynamic nature of the task, because macaques often alter their choice behavior based on the outcome of only a few trials. A new approach for exploring the neural structures involved in selection tasks, such as reversal learning, has been to investigate how the outcome of single rewarded trials, unrewarded trials, or sequences of such trials influences subsequent choices (Kennerley et al., 2006).
In light of these new trial-by-trial approaches, we applied a more fine-grained analysis to the previously published data on object reversal learning from groups of monkeys that had received OFC and AMG lesions. First, as a prelude to the new analysis, we determined the number of errors made after a reversal but before a rewarded trial. This analysis was conducted to determine whether OFC or AMG lesions differentially affected the monkeys' ability to stop responding to the previously rewarded object. Then a trial-by-trial analysis of monkeys' rewarded and unrewarded choices was used to further probe the contribution of both these areas to reward-based decision making.
Materials and Methods
Subjects
Eighteen adult male rhesus monkeys (Macaca mulatta) served as subjects. Five monkeys sustained bilateral excitotoxic lesions of the amygdala (group AMG), three received bilateral aspirative lesions of the orbital prefrontal cortex (group OFC), and the remaining 10 were retained as unoperated controls (group CON). Four of the controls were tested concurrently with the AMG group, whereas the remaining six were tested with the OFC group. The training histories of all groups were highly similar. The results from the same set of monkeys have been reported previously on object reversal learning, emotional responses, reinforcement devaluation and instrumental extinction (Izquierdo et al., 2004, 2005; Izquierdo and Murray, 2005, 2007). Animals weighed between 6.2 and 12.6 kg at the start of testing. Each animal was housed individually, was kept on a 12 h light/dark cycle, and had access to water 24 h per day. All procedures were reviewed and approved by the National Institute of Mental Health Animal Care and Use Committee.
Apparatus and materials
Testing was conducted in a modified Wisconsin General Test Apparatus (WGTA) in a dark room. Monkeys were presented with a test tray (19.2 × 72.7 × 1.9 cm) situated in an illuminated test compartment. The test tray contained two food wells spaced equally (145 mm) from the center of the tray. Each well was 6 mm deep and 38 mm in diameter. For preliminary training, several dark gray matboard plaques and three junk objects were used. Two objects which were novel to the monkeys at the start of testing were used for the object reversal learning task. Throughout all training and testing, a half peanut served as a reward.
Surgery
For a full description of the surgical methods, see the study by Izquierdo and Murray (2004). For all animals, anesthesia was induced with ketamine hydrochloride (10 mg/kg, i.m.) and maintained with isoflurane (1.0–3.0%, to effect). Monkeys received isotonic fluids via an intravenous drip. Aseptic procedures were used throughout while heart rate, respiration rate, blood pressure, expired CO2, and body temperature were continuously monitored. Monkeys were treated with dexamethasone sodium phosphate (0.4 mg/kg, i.m.) and cefazolin antibiotic (15 mg/kg, i.m.) for 1 d before and for 1 week after surgery. At the conclusion of surgery and for 2 additional days, animals received ketoprofen analgesic (10–15 mg/kg, i.m.); ibuprofen (100 mg) was provided for 5 additional days.
Orbital prefrontal cortex lesions.
Bilateral subpial aspirative lesions of the OFC were performed using a combination of electrocautery and suction. The lesions were made in two stages, balanced for hemisphere of first removal, and were intended to remove Walker's areas 11, 13, and 14, and the caudal part of area 10 (Walker, 1940). The location and extent of the intended lesion is shown in Figure 1.
Amygdala lesions.
Bilateral excitotoxic lesions of the amygdala were made in two stages. For each monkey, magnetic resonance imaging (MRI) was used to determine the stereotaxic coordinates for injections of excitotoxins. Monkeys received 18–22 injections of the excitotoxin ibotenic acid into each amygdala. At each site, which was spaced ∼2 mm apart in each plane, we injected 0.6–1.0 μl of ibotenic acid (10 μg/μl, 0.2 μl/min; Biosearch Technologies). Two animals received injections into the left amygdala in the first operation and the other three received injections into the right amygdala. During the second surgery, each animal received injections into the amygdala in the intact hemisphere. The intended lesion included the entire amygdala, including the basolateral group of nuclei as well as the central and cortical nuclei (Fig. 2).
Behavioral testing
Preliminary training.
Before formal testing, all animals were habituated to the WGTA and were allowed to retrieve food from the test tray. Through successive approximation, monkeys were trained to displace plaques and then objects placed over the food wells to retrieve rewards.
Object reversal learning.
On each trial, monkeys were presented with two objects placed over the food wells. To prevent object preferences from biasing learning scores, both objects were either baited (for half the monkeys in each group) or unbaited (remaining monkeys) on the first trial of the first session of initial learning. If the chosen object was rewarded, it was designated the S+; if not, it was designated the S−. From trial 2 onward, the food well corresponding to the S+ was baited, whereas the other was not. The monkey was allowed to displace one of the two objects and, if correct, to retrieve the food reward underneath. The intertrial interval was 10 s and the left–right position of the correct object was pseudorandomly determined. There was no correction after errors. Monkeys were tested for 30 trials per daily session for 5–6 d per week. Criterion was set at 93% for 1 d followed by at least 80% the next day. Once monkeys had attained criterion on the initial object discrimination problem, the contingencies were reversed and animals were trained to the same criterion as before. This procedure was repeated until a total of nine serial reversals had been completed.
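The two-day criterion can be expressed as a simple check over daily session scores. The sketch below is illustrative only: the function name `criterion_day` and its parameters are hypothetical, and it assumes that "93%" means at least 93% correct (28 of 30 trials) on the first day, which the text does not state explicitly.

```python
from typing import Optional, Sequence


def criterion_day(correct_per_session: Sequence[int],
                  trials: int = 30,
                  first_pct: int = 93,
                  second_pct: int = 80) -> Optional[int]:
    """Return the 0-based index of the session that completes criterion.

    Criterion (as assumed here): at least `first_pct` percent correct in one
    session, followed by at least `second_pct` percent correct in the next.
    Returns None if criterion is never reached.
    """
    for day in range(len(correct_per_session) - 1):
        # Integer arithmetic avoids floating-point edge cases at the thresholds.
        first_ok = correct_per_session[day] * 100 >= first_pct * trials
        second_ok = correct_per_session[day + 1] * 100 >= second_pct * trials
        if first_ok and second_ok:
            return day + 1
    return None


# Hypothetical scores (number correct out of 30) for four daily sessions.
print(criterion_day([20, 28, 24, 30]))
```

With 30-trial sessions, these thresholds correspond to 28 and 24 correct responses, respectively.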
Lesion assessment
The lesions in both sets of animals have been assessed using MRI scans and extensively documented in previously published work from our laboratory (Izquierdo et al., 2004; Izquierdo and Murray, 2005). In addition, the brains of monkeys with OFC lesions were processed using traditional histological techniques. The location and extent of the OFC lesion in each monkey is illustrated in Figure 1. The monkeys in this group sustained an estimated 78.5% damage to the OFC. The location and extent of the AMG lesions based on T2-weighted MR scans obtained within 7 d of surgery are shown in Figure 2. Monkeys in the AMG group sustained an estimated 93.4% damage to the AMG. Images of the lesion for case AMG2 are not shown in Figure 2 because we were only able to obtain postoperative MR scans from one hemisphere.
Results
The data from the object reversal learning task have been analyzed previously by comparing the number of errors per reversal between groups (Izquierdo et al., 2004; Izquierdo and Murray, 2007). In addition, an analysis of errors by stage (below chance, near chance, above chance) was performed. For the present study, we conducted a more fine-grained analysis of the monkeys' choices. Because all monkeys had been tested for at least 2 d after each reversal, we restricted our analysis to this data set. Accordingly, data from the two sessions (60 trials) immediately after each of the nine reversals were analyzed, making a total of 540 trials for each monkey. No statistically significant differences were found between the two CON groups (p > 0.1, all comparisons) so their data were collapsed into one set for analysis. The OFC group scored more total errors across the nine reversals compared with both CON and AMG groups (one-way ANOVA, effect of group, F(2,15) = 12.1, p = 0.001; post hoc Bonferroni's tests: OFC vs CON or AMG groups, p < 0.002; CON vs AMG, p > 0.5) (Fig. 3). Thus, in line with our earlier analysis of the entire data set from these same groups of monkeys (Izquierdo et al., 2004; Izquierdo and Murray, 2007), the OFC group but not the AMG group showed a reversal learning impairment.
We next divided the total number of errors scored by each monkey across all reversals into two categories: (1) errors made after a reversal but before a reward had been obtained, and (2) errors scored after the first correct trial after a reversal until the next change in contingencies (Fig. 3, Before 1st correct, After 1st correct, respectively). A three (group) by two (error type) repeated-measures ANOVA revealed a significant main effect of error type (F(1,15) = 209.33; p < 0.001), reflecting the greater number of errors made after the first correct trial relative to those before the first correct trial (Fig. 3). The ANOVA also revealed a significant interaction of group by error type (F(2,15) = 30.93; p < 0.001), indicating that not all groups performed similarly across the two error types. A one-way ANOVA of the total number of errors made before the first correct trial revealed that the OFC group made more errors of this type than the CON but not the AMG group (F(2,15) = 3.907, p = 0.043; post hoc Bonferroni's tests: OFC vs CON, p = 0.043; OFC vs AMG, p > 0.1; CON vs AMG, p > 0.5). In addition, the OFC group made significantly more errors after the first correct trial than both CON and AMG groups (F(2,15) = 19.64, p < 0.001; post hoc Bonferroni's tests: OFC vs CON or AMG, p < 0.001; CON vs AMG, p > 0.5).
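The division of errors into the two categories can be sketched as follows. This is a minimal illustration, not the analysis code used for the study: the function name `split_errors` is hypothetical, and the input is assumed to be the ordered outcomes (True for a rewarded choice, False for an error) of the trials analyzed after a single reversal.

```python
from typing import Sequence, Tuple


def split_errors(outcomes: Sequence[bool]) -> Tuple[int, int]:
    """Count errors before and after the first correct trial of one reversal.

    Returns (errors_before_first_correct, errors_after_first_correct).
    If no correct trial occurs, every error counts as "before".
    """
    before = 0
    for i, correct in enumerate(outcomes):
        if correct:
            # All remaining errors fall in the "after 1st correct" category.
            after = sum(1 for c in outcomes[i + 1:] if not c)
            return before, after
        before += 1
    return before, 0


# Hypothetical sequence: two errors, then the first reward, then two more errors.
print(split_errors([False, False, True, False, True, True, False]))
```

Summing each element of the returned pair across the nine reversals would give per-monkey totals for the two error types.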
These findings suggest that the OFC and AMG groups' choices may be influenced differentially by correctly performed trials. To investigate this possibility, we used an analysis intended to tease apart the effect of correctly and incorrectly performed trials on subsequent performance. The method developed by Kennerley et al. (2006) to investigate monkeys' performance on a reinforcement-guided action selection task involves several steps. First, monkeys' performance (percentage correct) on the trials immediately after either a correct choice (C + 1) or an error (E + 1) was determined (Fig. 4). A one-way ANOVA revealed that the OFC group performed significantly worse than both CON and AMG groups on trials after a correct response (group, F(2,15) = 15.381, p < 0.001; post hoc Bonferroni's tests: OFC vs CON, p = 0.001; OFC vs AMG, p < 0.001). The AMG group did not differ from CON group after correct responses (p > 0.5). Although there was an effect of group on E + 1 trials, post hoc tests did not reveal any significant differences between groups [one-way ANOVA, group, F(2,15) = 6.071, p < 0.05; post hoc Bonferroni's test, p > 0.05 all comparisons; 1 − β error probability (observed power) = 0.4].
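The first step, scoring performance on the trial immediately after a correct choice (C + 1) or an error (E + 1), can be sketched as below. The function name `next_trial_performance` is hypothetical, and the sketch assumes a single uninterrupted sequence of outcomes; how session and reversal boundaries were handled in the actual analysis is not specified here.

```python
from typing import Dict, List, Optional, Sequence


def next_trial_performance(outcomes: Sequence[bool]) -> Dict[str, Optional[float]]:
    """Percentage correct on trials following a correct choice or an error.

    `outcomes` holds True for a correct (rewarded) choice and False for an
    error. A value of None means no trial of that type was available.
    """
    after_correct: List[bool] = []
    after_error: List[bool] = []
    for current, following in zip(outcomes, outcomes[1:]):
        (after_correct if current else after_error).append(following)

    def pct(scores: List[bool]) -> Optional[float]:
        return 100.0 * sum(scores) / len(scores) if scores else None

    return {"C+1": pct(after_correct), "E+1": pct(after_error)}


# Hypothetical six-trial sequence.
print(next_trial_performance([False, False, True, True, False, True]))
```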
The second step of the analysis further investigated the effect of correct choices on behavior by examining whether the number of consecutive correct (C) trials performed after an error (E) influenced performance. In this “EC analysis,” scores are tabulated for every trial following an error (E + 1), and for every instance in which an error is followed by a single correct trial (EC + 1), by two correct trials [EC(2) + 1], and so on (Fig. 5, top). For the present analysis, we applied an arbitrary rule that every monkey had to have at least 10 instances of each trial type; accordingly, we were able to include trial types EC + 1 to EC(4) + 1 (Fig. 5, bottom). A three (group) by five (trial type) repeated-measures ANOVA on performance revealed significant main effects of trial type (F(4,60) = 70.31; p < 0.001) and group (F(2,15) = 13.65; p < 0.001), as well as a significant interaction between group and trial type (F(8,60) = 2.96; p = 0.009). To explore these effects, one-way ANOVAs of the different trial types were conducted and confirmed that the OFC group performed significantly worse than both CON and AMG groups for trial types EC(2) + 1 to EC(4) + 1 (F(2,15) > 8.8; p < 0.003; post hoc Bonferroni's tests, OFC vs CON or AMG, p < 0.003). In contrast, the AMG group performed significantly better than both CON and OFC groups on EC + 1 trials (effect of group, F(2,15) = 6.329, p = 0.01; post hoc Bonferroni's test, AMG vs OFC or CON, p < 0.05). Importantly, differences in the performance of the two lesion groups compared with the CON group could not be ascribed to a disparity in the number of instances of each trial type. For those trial types on which the performance of either the AMG or OFC lesion groups differed from that of the CON group [AMG, EC + 1; OFC, EC(3–4) + 1], there were comparable numbers of trials considered in the analysis (one-way ANOVAs, p > 0.5).
In addition, if the EC analysis is run on data controlling for the number of correct trials after each reversal, the findings we report remain the same.
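The tabulation underlying the EC analysis can be illustrated with the sketch below. It is not the original analysis code: the names `ec_label` and `ec_analysis` are hypothetical, and the sketch assumes that each instance is anchored on an error, followed by exactly the stated number of correct trials, with the next trial being the one scored. It returns both the number of instances and the percentage correct for each trial type, so that a minimum-instance rule (such as the 10-instance rule described above) can be applied by the caller.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def ec_label(run: int) -> str:
    """Name a trial type by the number of correct trials following the error."""
    if run == 0:
        return "E+1"
    if run == 1:
        return "EC+1"
    return f"EC({run})+1"


def ec_analysis(outcomes: Sequence[bool],
                max_run: int = 4) -> Dict[str, Tuple[int, Optional[float]]]:
    """Tabulate (instances, percent correct) for E+1 through EC(max_run)+1.

    `outcomes` holds True for a correct choice and False for an error.
    """
    n = len(outcomes)
    table: Dict[str, Tuple[int, Optional[float]]] = {}
    for run in range(max_run + 1):
        scored: List[bool] = []
        # i indexes the anchoring error; the scored trial is at i + run + 1.
        for i in range(n - run - 1):
            if outcomes[i]:
                continue  # an instance must start with an error
            if all(outcomes[i + 1: i + 1 + run]):
                scored.append(outcomes[i + run + 1])
        pct = 100.0 * sum(scored) / len(scored) if scored else None
        table[ec_label(run)] = (len(scored), pct)
    return table


# Hypothetical ten-trial sequence (True = correct, False = error).
seq = [False, True, True, False, True, False, False, True, True, True]
for trial_type, (count, pct) in ec_analysis(seq).items():
    print(trial_type, count, pct)
```

Because an error followed by three correct trials also contains an instance of EC + 1 and EC(2) + 1, the same stretch of trials can contribute to several trial types under this scheme.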
The third step of the analysis examined the effect of multiple errors [E + 1, EE + 1, EE(2) + 1, etc.] on subsequent performance. This “EE analysis,” which was conducted in a manner that paralleled the EC analysis, failed to reveal any significant differences between the groups (group by trial type interaction, p > 0.1; 1 − β error probability = 0.57).
Discussion
Impairment in object reversal learning is a hallmark of damage to OFC and, until recently, the same was thought to be true for the AMG as well. In light of the findings of Izquierdo and Murray (2007) and in an attempt to understand the nature of the changes in behavior following lesions of these two interconnected areas, we conducted a more fine-grained analysis of already published data, one intended to tease apart the effects of correctly and incorrectly performed trials on subsequent performance.
The new analyses confirmed the significant impairment on object reversal learning after OFC but not AMG lesions, consistent with previous findings from these monkeys (Izquierdo et al., 2004; Izquierdo and Murray, 2007). In addition, consistent with previous reports, monkeys with OFC lesions exhibited a mild but significant tendency to make more errors than controls in the period after a reversal but before a rewarded choice. The new trial-by-trial analysis yielded two new findings. First, monkeys with OFC lesions failed to benefit as much as control and AMG groups from correctly performed trials. Second, monkeys with selective excitotoxic AMG lesions benefited more than controls from a correctly performed trial, when the correct trial immediately followed an error (EC + 1). There were no effects of AMG lesions on any other measures of performance. These findings provide new insights into the role of the OFC and AMG in object reversal learning in particular and reward-based decision making in general.
OFC, perseveration, and reward
The majority of previous reports have highlighted the perseverative nature of animals' choices after OFC lesions (Jones and Mishkin, 1972; Dias et al., 1996; Chudasama and Robbins, 2003) (cf. Iversen and Mishkin, 1970). The aforementioned studies defined perseverative behavior using a stage analysis developed by Jones and Mishkin (1972). In this scheme, stage 1 errors, defined as all errors occurring in sessions in which the score falls below chance (i.e., >20 errors in 30 trials), are considered perseverative. Using this same stage analysis, however, Izquierdo and Murray (2004) failed to confirm that monkeys with OFC lesions (the same monkeys reported here) made more perseverative errors. This difference was attributed to the more restricted OFC lesion used relative to the one studied by Jones and Mishkin (1972). The present report, which uses a different analysis applied to a subset of the same data, shows that the monkeys studied by Izquierdo et al. (2004) were mildly impaired relative to controls in the number of errors scored after a reversal but before a correct choice was made. This finding potentially brings the data into line with several other studies, including those cited above, as it suggests the monkeys with OFC damage were “perseverating” to some extent. This deficit may be linked to deficits in instrumental extinction reported in the same animals (Izquierdo and Murray, 2005).
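For comparison with the trial-by-trial measures, the stage 1 (perseverative) error count described above can be sketched as a sum over below-chance sessions. The function name `stage1_errors` is hypothetical, and only the stage 1 rule stated in the text (more than 20 errors in a 30-trial session) is implemented; the boundaries of the other stages are not given here and are therefore omitted.

```python
from typing import Sequence


def stage1_errors(errors_per_session: Sequence[int],
                  trials_per_session: int = 30,
                  threshold: int = 20) -> int:
    """Sum the errors from sessions scored below chance (errors > threshold)."""
    for errors in errors_per_session:
        if not 0 <= errors <= trials_per_session:
            raise ValueError("error count outside the possible range")
    return sum(e for e in errors_per_session if e > threshold)


# Hypothetical error counts for five consecutive sessions after a reversal.
print(stage1_errors([25, 21, 20, 12, 3]))
```

Under this rule a session with exactly 20 errors does not contribute, which shows how a session-level measure can miss errors that the "before first correct" count captures.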
The inability of monkeys with OFC lesions to optimally use reward information from correctly performed trials to guide their subsequent choices has not been emphasized before. Not only did monkeys with OFC lesions perform worse than controls once they had made a correct choice after a reversal (Fig. 3, After 1st correct), but even after having made four consecutive correct choices they were still less likely than controls to select the correct object on the next trial [Fig. 5, EC(4) + 1]. We note that this apparent inability to use reward information is only observed after a reversal and does not affect discrimination learning as a whole; the OFC group was able to learn the initial discrimination at the same rate as controls. Whether deficits in using either errors or correct information are the result of disruption to common or distinct processes is explored below.
Amygdala and incentive value
The finding that lesions of the AMG facilitated monkeys' performance is surprising given that, originally, the AMG was thought to be essential for normal performance on object reversal learning. As indicated previously, however, although deficits on object reversal learning have been found after aspiration or radio frequency lesions of the AMG in monkeys (Schwartzbaum and Poulos, 1965; Jones and Mishkin, 1972; Aggleton and Passingham, 1981), excitotoxic lesions do not produce such deficits (Izquierdo and Murray, 2007). However, the present reanalysis shows that “lack of impairment” is an incomplete description. From the EC analysis, it is apparent that the AMG group were more likely than controls to continue to choose the rewarded object if they had already been rewarded for choosing it once before, but only when that correct choice followed an error (Fig. 5, EC + 1). It is possible that similar effects on other trial types [EC(2–4) + 1] were obscured by ceiling effects.
The facilitation of the performance of the AMG group complements a recent report in which the reversal learning performance of two human subjects was enhanced by AMG damage (Hampton et al., 2007). It is also reminiscent of the performance of the same group on instrumental extinction (Izquierdo and Murray, 2005), in which they took fewer trials than control monkeys to extinguish responding to a previously rewarded object. Despite differences between instrumental extinction and object reversal learning tasks (e.g., extinction assesses responses to a single object, whereas object reversal requires a choice between two objects), it may be possible to understand the pattern of results within a single theoretical framework.
Izquierdo and Murray (2005) proposed that the AMG lesion-induced facilitation in extinction might be attributable to either an increased sensitivity to nonrewarded trials or an inability to represent the incentive value of the reward. The results of the EE analysis and the finding that the AMG group made just as many “before first correct” errors as controls argue against the first explanation. Their second proposal, that AMG lesions degraded representations of incentive value, at first seems untenable because monkeys with AMG lesions were actually better than controls at using reward information to guide subsequent choices. However, it may be possible to account for both data sets with this hypothesis by considering the contribution of other structures, such as the inferior temporal cortex (IT) and OFC, in monkeys with AMG lesions during reversal learning.
Frontotemporal interaction during reward-based decision making
Like OFC lesions, removal of parts of the IT, specifically the rhinal cortex, in monkeys yields reversal learning impairments (Murray et al., 1998). Recent research indicates that these impairments occur because functional interaction between the IT and OFC is necessary for acquiring and implementing visually guided rules (Bussey et al., 2002; Browning et al., 2007). In these cases, the occurrence of food reward provides information distinct from its hedonic properties, for example, by instructing visually guided rules such as win–stay or win–shift (Gaffan, 1985). Unlike the IT, the AMG is not essential for visually guided rules (Murray and Wise, 1996), but is necessary for representing the incentive value of a reward and updating this value (Málková et al., 1997; Parkinson et al., 2001; Wellman et al., 2005; Izquierdo and Murray, 2007). Furthermore, direct functional interaction between the OFC and AMG is essential for updating reward value (Baxter et al., 2000). Based on these findings, it has been proposed that there may be two OFC–temporal lobe circuits that drive reward-based decision making (Murray, 2007). One circuit including the IT and OFC has been hypothesized to process the sensory or informational properties of reward (i.e., its occurrence or not), whereas the other, including the OFC and AMG, would process the affective or hedonic properties of reward.
Without affective input from the AMG to the OFC, monkeys' choices may be dependent on OFC–IT interactions alone and therefore driven by visually guided rules, sensory–sensory associations, or a combination of the two. The findings from an odor-guided reversal learning study in rats with AMG lesions suggest that this may be the case (Schoenbaum et al., 2003). Although rats with AMG lesions were able to reverse cue–outcome associations as efficiently as sham-operated controls, Schoenbaum et al. (2003) reported that OFC neurons in rats with AMG lesions did not encode odor–outcome associations, at least not during the time the odor was present. Instead, OFC neurons were more likely to be activated by the sensory properties of the cues. These findings are consistent with the idea that the choices of group AMG are more strongly guided by the occurrence of food reward on the most recent trial because there is no affective signal to bias them toward choosing the previously rewarded object. Similarly, without a representation of the incentive value of the reward to motivate responding, monkeys would also extinguish to a previously rewarded object more quickly compared with controls.
The clear dissociation between the influence of rewarded versus nonrewarded trials on subsequent choices is important because it demonstrates that monkeys with OFC lesions are impaired at integrating reward and, to a lesser extent, error information to dynamically guide their selections. Paradoxically, the inability of monkeys with OFC lesions to use reward information to guide choice after reversals may be the result of less flexible coding in the AMG. Without input from the OFC during odor-guided reversal learning, AMG neurons in rats are less flexible and show disrupted outcome-expectancy encoding across reversals (Saddoris et al., 2005). Less flexible encoding in the AMG might account not only for an inability to benefit from correctly performed trials, but also for the increased number of errors before the first correct trial and extinction deficits after OFC lesions. Alternatively, an increased number of errors before the first correct response might reflect disrupted encoding in the IT. An important avenue for future research will be to determine whether similar effects are seen in monkeys and also how the activity of neurons in other interconnected areas, including the IT and portions of the striatum, is altered after OFC lesions.
In sum, our reanalysis indicates that OFC lesions reduced, whereas AMG lesions enhanced, the influence of correctly performed trials on subsequent choices. This explanation implies that affective representations established in the AMG actually impede reward-based decision making after a reversal; that is, without such affective signals to bias choices during reversal learning, monkeys may behave more rationally.
Footnotes
- This work was supported by the Intramural Research Program of the National Institute of Mental Health. We thank A. Izquierdo, R. Suda, and K. Wright for testing and data collection.
- Correspondence should be addressed to Dr. Peter H. Rudebeck, Section on the Neurobiology of Learning and Memory, Laboratory of Neuropsychology, National Institute of Mental Health–National Institutes of Health, Building 49, Suite 1B80, 49 Convent Drive, Bethesda, MD 20892-4415. rudebeckp{at}mail.nih.gov