The macaque orbital prefrontal cortex (PFo) has been implicated in a wide range of reward-guided behaviors essential for efficient foraging. The PFo, however, is not a homogeneous structure. Two major subregions, distinct by their cytoarchitecture and connections to other brain structures, compose the PFo. One subregion encompasses Walker's areas 11 and 13 and the other centers on Walker's area 14. Although it has been suggested that these subregions play dissociable roles in reward-guided behavior, direct neuropsychological evidence for this hypothesis is limited. To explore the independent contributions of PFo subregions to behavior, we studied rhesus monkeys (Macaca mulatta) with restricted excitotoxic lesions targeting either Walker's areas 11/13 or area 14. The performance of these two groups was compared to that of a group of unoperated controls on a series of reward-based tasks that has been shown to be sensitive to lesions of the PFo as a whole (Walker's areas 11, 13, and 14). Lesions of areas 11/13, but not area 14, disrupted the rapid updating of object value during selective satiation. In contrast, lesions targeting area 14, but not areas 11/13, impaired the ability of monkeys to learn to stop responding to a previously rewarded object. Somewhat surprisingly, neither lesion disrupted performance on a serial object reversal learning task, although aspiration lesions of the entire PFo produce severe deficits on this task. Our data indicate that anatomically defined subregions within macaque PFo make dissociable contributions to reward-guided behavior.
Striking changes in reward-guided behavior and affect are characteristic of damage to or dysfunction within the orbital prefrontal cortex (PFo) in both humans and nonhuman animals (Schoenbaum and Roesch, 2005; Murray et al., 2007). Lesions of the PFo have been reported to disrupt a host of different reward-related processes including, but not limited to, the ability to flexibly alter object–reward relationships (Butter et al., 1963; Iversen and Mishkin, 1970; Jones and Mishkin, 1972; Dias et al., 1996; Fellows and Farah, 2005), update the value of specific object–reward associations (Izquierdo et al., 2004; Machado and Bachevalier, 2007; Baxter et al., 2009), make appropriate choices based on preferences (Baylis and Gaffan, 1991; Fellows, 2006; Machado and Bachevalier, 2007), and learn probabilistic stimulus–reward associations (Rudebeck et al., 2008; Walton et al., 2010). Whether these alterations in behavior are caused by disruption to a single function subserved by the PFo or whether they are separable remains an outstanding question.
The macaque PFo is comprised of at least two networks or subregions that differ in cytoarchitecture and pattern of anatomical connections (Carmichael and Price, 1996). One of these subregions encompasses Walker's areas 11 and 13 on the orbital surface of the macaque frontal lobe. It receives inputs from all sensory modalities via connections with sensory cortical areas and is also reciprocally connected with the amygdala (Carmichael and Price, 1995a,b). By contrast, the other subregion, which centers on Walker's area 14 and extends onto the medial aspect of the hemisphere, receives almost no inputs from sensory cortical areas. Instead, it is densely interconnected with structures that modulate autonomic arousal and with other parts of the medial frontal cortex (Carmichael and Price, 1995a,b). Given these distinctions in anatomical connectivity, it has been suggested that these two subdivisions of the PFo make distinct contributions to reward-guided behavior and foraging (Saleem et al., 2008). Direct evidence for this hypothesis, however, is scarce. Damage to the PFo in humans is rarely confined to one subregion (Hornak et al., 2004; Fellows and Farah, 2005), and lesion studies in monkeys have generally focused on determining the function of the PFo as a whole (Jones and Mishkin, 1972; Izquierdo et al., 2004).
To explore the contributions of these PFo subregions to behavior, we studied two separate groups of rhesus monkeys (Macaca mulatta) that had received excitotoxic lesions of either areas 11/13 or area 14. The two subregions together compose PFo (areas 11, 13, and 14). The performance of these two groups as well as that of a group of unoperated controls was assessed on a series of tasks sensitive to lesions of PFo as a whole. Because recent human neuroimaging studies have suggested that medial parts of the PFo may be involved in representing the value of different options (Boorman et al., 2009), we also conducted an object preference task; this test was intended to assess each monkey's ability to select objects based on the relative value of their associated food rewards, independent of learning object–food associations.
Materials and Methods
Twelve adult male rhesus monkeys (M. mulatta) served as subjects. Four monkeys sustained bilateral excitotoxic lesions of Walker's areas 11 and 13 (group 11/13), four received bilateral excitotoxic lesions of Walker's area 14 (group 14), and the remaining four were retained as unoperated controls (CON group). Monkeys weighed between 5.1 and 10.0 kg and all were at least 4.5 years old at the start of testing. Each animal was housed individually, kept on a 12 h light/dark cycle, and had access to water 24 h a day. All procedures were reviewed and approved by the NIMH Animal Care and Use Committee.
Apparatus and materials
For reinforcer devaluation, object reversal learning, object preference, and extinction tasks, testing was conducted in a modified Wisconsin General Test Apparatus (WGTA) in a dark room. The WGTA was divided into an animal compartment, which fit a large wheeled transport cage, and a test compartment, which held a test tray. Two test trays (19.2 × 72.7 × 1.9 cm)—one two-well tray and one three-well tray—were available for testing. The two-well test tray contained two food wells spaced equally (235 mm apart) from the center of the tray, whereas the three-well tray contained three food wells, one in the center of the tray and two equally spaced to the left and right (145 mm from center). Each well was 6 mm deep and 38 mm in diameter. During behavioral test sessions, the animal compartment was unlit and the test compartment was illuminated. An opaque screen that could be raised or lowered by the experimenter separated the two compartments. An additional screen on the back of the WGTA, also controlled by the experimenter, allowed one-way viewing of the test compartment. For preliminary training, several dark gray matboard plaques and three junk objects were used. For reinforcer devaluation and object preference testing, two different sets of 120 novel “junk” objects were used. Objects varied widely in color, shape, and size. An additional pair of objects was available for object reversal learning, and yet another novel object was used for extinction. For food rewards we used six different food items: M&M's (Mars), half peanuts, raisins, Craisins (Ocean Spray), banana-flavored pellets (Noyes), and fruit snacks (Giant Foods).
Anesthesia was induced with ketamine hydrochloride (10 mg/kg, i.m.) and maintained with isoflurane (1.0–3.0%, to effect). Monkeys received isotonic fluids via an intravenous drip. Aseptic procedures were used throughout. Heart rate, respiration rate, blood pressure, expired CO2, and body temperature were continuously monitored. Monkeys were treated with dexamethasone sodium phosphate (0.4 mg/kg, i.m.) and cefazolin antibiotic (15 mg/kg, i.m.) for 1 d before and for 1 week after surgery. During surgery monkeys were given 30 ml of mannitol (20%, 1 ml/min, i.v.) to increase access to the orbital surface and control edema.
A large bilateral bone flap was made over the region of the prefrontal cortex and two separate dura flaps were opened to allow access to the orbital surface in each hemisphere. Bilateral excitotoxic lesions of cortex corresponding to areas 11/13 and 14 were made by visually guided injections using a modified 30 gauge Hamilton syringe with the aid of an operating microscope. At each injection site 1.0 μl of ibotenic acid (10–15 μg/μl; Sigma or Tocris Bioscience) was injected into the cortex as a bolus. The needle was then held in place for 2–3 s to allow the toxin to diffuse away from the injection site. Injections were spaced ∼2 mm apart. Except where noted, all operations were carried out in a single stage.
For the area 11/13 lesion, a mean of 66 injections (range: 57–73) was made in each hemisphere. Injections were made into the cortex on the orbital surface between the fundus of the lateral orbital sulcus and the medial edge of the medial orbital sulcus. The rostral boundary of the injections was a line joining the tips of the medial and lateral orbital sulci. The caudal boundary of the injections was a line joining the most caudal points of the medial and lateral orbital sulci (Fig. 1). For the purpose of relating the location of our intended lesion to other commonly employed anatomical frameworks, we note that the cortex included in the 11/13 lesion corresponds roughly to areas 13l, 13m, 13b, 11l, and 11m of Carmichael and Price (1994).
For the area 14 lesion, a mean of 47 injections (range, 32–64) was made into each hemisphere to cover the area between the medial edge of the medial orbital sulcus and the rostral sulcus on the medial surface of the hemisphere. The rostral boundary of the injections was the rostral tip of the medial orbital sulcus, whereas the caudal boundary was the most caudal point of the medial orbital sulcus. The cortex included in the intended lesion corresponds approximately to areas 14r, 14c, and 10m of Carmichael and Price (1994). To inject excitotoxin into the cortex on the medial surface below the rostral sulcus, the Hamilton syringe was advanced ∼5–9 mm from the surface of the orbital cortex just lateral to the olfactory tract, through the white matter of the ventromedial frontal cortex, into the cortex on the medial surface, and the injection was made. In one monkey in the area 14 group (case 3), the postoperative T2-weighted MR scan (see next section, Lesion assessment) suggested that the lesion had not been completely successful. A second operation was performed in which additional injections of ibotenic acid were made.
At the conclusion of surgery the wound was closed in anatomical layers. For 2 d after surgery animals received ketoprofen analgesic (10–15 mg/kg, i.m.), and ibuprofen (100 mg) was provided for five additional days. Monkeys were given at least 2 weeks recovery from surgery before behavioral testing resumed.
Lesion assessment
Lesions of areas 11/13 and area 14 were assessed using T2-weighted MRI scans obtained within 1 week of the surgery. The location and extent of excitotoxic lesions are reliably indicated by white hypersignal on T2-weighted scans (Málkova et al., 2001). Accordingly, for each operated monkey the extent of hypersignal on coronal MR images between ∼40 and 26 mm anterior to the interaural plane was plotted onto a standard set of drawings of coronal sections from a macaque brain. The location and extent of the lesions were then measured using a digitizing tablet (Wacom). The injections of ibotenic acid into areas 11/13 reliably resulted in hypersignal visible in T2-weighted MR scans on the orbital surface between the medial and lateral orbital sulci. The hypersignal also typically extended just medial to the medial orbital sulcus and into the medial bank of the lateral orbital sulcus, as intended (Fig. 1). We estimated that the injections destroyed a mean of 85.8% (range, 81.7–91.8) of areas 11/13 (Table 1). The hypersignal also extended beyond the boundaries of the intended lesion; in some cases hypersignal was evident in caudal area 13 (area 13a of Carmichael and Price, 1994) and in the adjacent portion of the ventral striatum.
As intended, injections into area 14 reliably resulted in hypersignal visible in T2-weighted MR scans between the medial orbital sulcus and the rostral sulcus on the medial surface of the hemisphere (Fig. 2). We estimated that the injections destroyed a mean of 79.3% (range, 58.7–96.0) of area 14 (Table 1). In some cases the MR scans revealed hypersignal outside the boundaries of the intended lesion; in cases 2 and 4, hypersignal was evident in area 25 and ventral area 32.
Before surgery all animals were habituated to the WGTA and were allowed to retrieve food from the test tray. Through successive approximation, monkeys were trained to displace plaques and then objects placed over the food wells to retrieve food rewards. Following preliminary training and initial food preference testing, monkeys either received excitotoxic lesions within PFo or were retained as unoperated controls. They were then tested on a battery of tasks described below that was administered to all monkeys in the same order.
Food preference testing
After the monkeys had been habituated to the WGTA, each monkey's preference for six different foods was assessed over a 15 d period. Each day monkeys received 30 trials consisting of pairwise presentation of the six different foods, one each in the left and right wells of the test tray. On each trial the opaque screen was raised and the monkey was allowed to choose one of the foods. Once a choice had been made the screen was lowered and the experimenter noted the type of food chosen. There was a 10 s intertrial interval (ITI). Each food was paired with each of the other foods twice within a session. The left/right position of the foods was counterbalanced across trials. Preferences were determined by totaling the number of choices for the six different foods over the final 5 d of testing, as well as analyzing choices within each pair of foods.
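The session structure described above (six foods, every pair presented twice, 30 trials) can be sketched as follows. This is only an illustrative sketch: the function names are hypothetical, and it assumes that counterbalancing means each pair appears once in each left/right arrangement and that trial order is shuffled, neither of which the text specifies.

```python
from collections import Counter
from itertools import combinations
import random

# The six foods named in the text; the labels are plain strings here.
FOODS = ["M&M", "peanut", "raisin", "Craisin", "banana pellet", "fruit snack"]

def build_session(foods, rng):
    """Return 30 (left, right) trials: every pair of foods presented twice."""
    trials = []
    for a, b in combinations(foods, 2):   # 15 unordered pairs
        trials.append((a, b))             # food a in the left well
        trials.append((b, a))             # same pair with sides swapped (assumed)
    rng.shuffle(trials)                   # shuffled trial order is an assumption
    return trials

def total_choices(chosen_foods):
    """Tally how often each food was chosen across the pooled trials."""
    return Counter(chosen_foods)

session = build_session(FOODS, random.Random(0))
left_counts = Counter(left for left, _ in session)
```

Under these assumptions each food appears in 10 trials per session, five times in each well, so a raw choice total for a food can range from 0 to 10 per session.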
Colwill and Rescorla (1985) showed that selective satiation can be used to probe whether animals have learned specific response–outcome relationships. Here we used a reinforcer devaluation task that employed a selective satiation procedure to test whether monkeys could learn and update specific object–outcome associations. Previous work from this laboratory has shown that lesions of the PFo as a whole lead to impairments in updating the stimulus–outcome associations following selective satiation (Izquierdo et al., 2004). The present experiment was therefore conducted to determine which part of the PFo is important for this updating function.
Object discrimination learning.
Following surgery, monkeys were trained to discriminate 60 pairs of novel objects. For each pair, one object was randomly designated as the positive object (S+) and was associated with a food item, while the other was designated as negative (S−) and was not associated with a food item. Half of the positive objects were associated with food 1, the other half were associated with food 2. For each monkey, the identity of foods 1 and 2 was based on the monkey's previously determined food preferences. The foods selected were those that the monkey valued highly and were chosen on a roughly equal basis. Each trial started with the opaque screen being raised. Monkeys were presented with a pair of objects, one each overlying a food well, and were allowed to choose between them. If they displaced the S+ they were allowed to retrieve the food in the well underneath it before the trial was terminated by lowering the opaque screen. If they chose the S−, no food was available, and the trial was terminated without correction. The ITI was 20 s. At the completion of each trial the experimenter recorded the monkey's choice. Monkeys were given one 60-trial session per day. Each pair of objects was presented only once during a session, and the 60 pairs were presented in the same order each day. The left/right position of the S+ followed a pseudorandom order. Training continued until monkeys attained the criterion of a mean of 90% correct responses over five consecutive days (i.e., 270 correct of 300 trials).
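The learning criterion above (a mean of 90% correct over five consecutive days, i.e., 270 of 300 trials) can be checked with a sliding window over daily scores. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the reported session count includes the five criterion days, which the text does not state.

```python
def sessions_to_criterion(daily_correct, trials_per_session=60,
                          window=5, threshold_pct=90):
    """Return the 1-based session on which criterion is first met, else None.

    daily_correct holds the number of correct responses in each daily session.
    Integer arithmetic avoids floating-point rounding at the 270/300 boundary.
    """
    total = trials_per_session * window            # 300 trials per window
    for end in range(window, len(daily_correct) + 1):
        correct = sum(daily_correct[end - window:end])
        if correct * 100 >= threshold_pct * total:
            return end                             # counts the criterion days (assumed)
    return None

example_scores = [40, 50, 54, 54, 54, 54, 54]      # made-up data for illustration
example_result = sessions_to_criterion(example_scores)
```

With the made-up scores shown, the first five-day window reaching 270 correct is sessions 3 to 7, so the function returns 7.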
Reinforcer devaluation test 1.
Once monkeys had learned the 60 pairs, their choices of objects were assessed under two conditions: after one of the foods was devalued, and in normal (baseline) conditions. On separate days we conducted four test sessions, each consisting of 30 trials. Only the positive objects, those associated with foods, were used. On each trial, a food-1 object and a food-2 object were presented together for choice; each object covered a well baited with the appropriate food. Monkeys were allowed to select one object and to retrieve the food in the well underneath it. Trials were separated by 20 s. At the completion of each trial the experimenter recorded the object the monkey had displaced and whether or not the monkey had retrieved the food underneath the chosen object. We were unable to determine whether the retrieved food was consumed because the opaque screen was lowered between trials, blocking the experimenter's view of the monkey.
Preceding two of the test sessions, a selective satiation procedure intended to decrease the value of one of the foods was conducted. For the other two test sessions, which provided baseline scores, monkeys were not sated on either food before being tested. The order in which the test sessions occurred was the same for all monkeys and was as follows: (1) baseline test 1, neither food sated before test session; (2) food 1 devalued by selective satiation before test session; (3) baseline test 2, neither food sated before test session; (4) food 2 devalued by selective satiation before test session. After sessions that were preceded by selective satiation, monkeys were given 2 d rest. Between each test session monkeys were given one training session with the original 60-pair object discrimination task. The latter procedure was carried out to ensure that monkeys were still willing to select and consume both foods and that there were no long-term effects of the satiation procedure.
For the selective satiation procedure, a food box filled with a preweighed quantity of either food 1 or food 2 was attached to the front of the monkey's home cage. The monkey was given 15 min unobserved to eat as much as it wanted. At the end of the 15 min the food box was checked to see whether the monkey had consumed all of the food. If the box was empty it was refilled. Thirty minutes after the food box had first been attached to the home cage, an experimenter started to observe the monkey's behavior. The selective satiation procedure was deemed to be complete when the monkey refrained from retrieving food from the box for 5 min. The experimenter noted the amount of time taken in the selective satiation procedure and the total amount of food consumed by each monkey. The monkey was then taken to the WGTA within 10 min and the test session was conducted.
Reinforcer devaluation tests 2 and 3 (food only choices).
A second devaluation test, identical to the first, was conducted between 20 and 48 d after reinforcer devaluation test 1. Monkeys were retrained on the same 60 pairs to the same criterion as before. After relearning, the reinforcer devaluation test was conducted in the same manner as before.
In reinforcer devaluation test 3 the effect of selective satiation on monkeys' choices of foods alone was assessed. This test was conducted to evaluate whether satiety transferred from the home cage to the WGTA and whether behavioral effects of the lesion (if any) were due to an inability to link objects with food value as opposed to simply valuing the foods. This test was identical to both reinforcer devaluation tests 1 and 2, but with the important difference that no objects were presented over the two wells where foods were placed. On each trial of the 30-trial sessions, monkeys could see the two foods and were allowed to choose between them. As was the case for reinforcer devaluation tests 1 and 2, there were four critical test sessions: two were preceded by selective satiation and two were not.
Object reversal learning
Immediately following all reinforcer devaluation tests, monkeys were tested on object reversal learning. On each trial monkeys were presented with a single pair of objects placed over the food wells on a two-well tray; through trial and error they could learn which object was associated with a food reward. Both objects were novel at the beginning of testing. To prevent object preferences from biasing learning scores, both objects were either baited (for half the monkeys in each group) or unbaited (remaining monkeys) on the first trial of the first session of initial object discrimination learning. If the object chosen on the first trial was rewarded, it was designated the S+; if not, it was designated the S−. From trial 2 onward the food well covered by the S+ was baited whereas the food well covered by the S− was not. The monkey was allowed to displace one of the two objects and, if correct, to retrieve the food reward underneath. Throughout all training and testing a half peanut served as a reward. The ITI was 10 s and the left-right position of the correct object followed a pseudorandom order. There was no correction after errors. Monkeys were tested for 30 trials per daily session for 5–6 d per week. Criterion was set at 93% (i.e., 28/30) for 1 d followed by at least 80% (i.e., 24/30) the next day. Once monkeys had attained criterion on the initial object discrimination problem, the contingencies were reversed (i.e., the S+ became the S− and vice versa) and animals were trained to the same criterion as before. This procedure was repeated until a total of nine serial reversals had been completed.
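The two-day reversal criterion described above (at least 28/30 correct on one day, followed by at least 24/30 the next day) can be expressed as a simple scan over daily scores. This is a minimal sketch with a hypothetical function name; it assumes the two days must be consecutive test sessions.

```python
def reversal_criterion_day(daily_correct, first_day_min=28, second_day_min=24):
    """Return the 1-based day on which the two-day criterion is completed, else None.

    daily_correct holds the number of correct responses (out of 30) per session.
    """
    for day in range(1, len(daily_correct)):
        first_ok = daily_correct[day - 1] >= first_day_min    # 93% day
        second_ok = daily_correct[day] >= second_day_min      # 80% follow-up day
        if first_ok and second_ok:
            return day + 1
    return None

example_day = reversal_criterion_day([20, 28, 23, 29, 24])    # made-up scores
```

In the made-up example, the 28/30 session on day 2 is followed by only 23/30, so criterion is not completed until day 5 (29/30 then 24/30).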
Object preference task
At the completion of the object reversal learning task, an object preference task was conducted. This test was intended to assess each monkey's ability to select objects based on the relative value of their associated food rewards, independent of learning object–food associations. As was the case for the reinforcer devaluation tests, the task was conducted in two phases: an object discrimination learning phase followed by a preference test phase. In the first phase, monkeys learned object–food associations in the course of a standard object discrimination learning task. In the test phase, the ability of monkeys to use their subjective food preference to appropriately guide their choice of these same objects was assessed. No satiation was involved.
Object discrimination learning.
Monkeys were trained to discriminate a novel set of 60 pairs of objects. As before, one object in each pair was randomly designated as positive (S+) and was always baited with a food reward; the other object in the pair was designated as negative (S−) and was never baited. Twelve positive objects were associated with each of five different foods (M&M's, fruit snack, peanut, raisin, banana pellet). The object–food assignments were fixed throughout testing. For one monkey in the 11/13 group, Craisins were substituted for M&M's part way through acquisition because the monkey was unwilling to retrieve the M&M's. On each trial monkeys were presented with a pair of objects, one positive and one negative, and were allowed to choose between them by displacing one of the two objects. All other procedures were identical to those used for acquisition of the first set of 60 discrimination problems.
Once monkeys had reached criterion, their object preferences were assessed over four 30-trial test sessions. In each session, only the positive objects were used. Monkeys were presented with two objects on each trial and could choose between them; all objects were baited with the same food used during the object discrimination learning phase. Objects presented together for choice were always associated with different foods. Once monkeys had selected an object they were allowed to retrieve the food underneath, thereby terminating the trial. Trials were separated by 20 s. Each object was presented a single time during a test session, and no two objects were presented together more than once during the four test sessions. Objects associated with a particular food were paired with those of all the other foods to the same degree, and the left-right position of the object types followed a pseudorandom order.
Extinction
Monkeys were presented with a single novel object that covered the central well of a three-well test tray. On each trial monkeys were given 30 s to displace the object and retrieve the food hidden underneath. If the monkey retrieved the food, the trial was scored as correct; however, the trial ended only after the full 30 s had elapsed. If the monkey failed to retrieve the food within the limit of 30 s, it was scored as an omission. At the end of each 30 s trial the screen was lowered. Trials were separated by 15 s. Monkeys were tested at the rate of 30 trials per session, one session per day. Acquisition was considered complete when monkeys made 28 correct responses of 30 for five consecutive sessions (i.e., 140 correct responses in 150 trials).
Following acquisition, monkeys received five consecutive extinction sessions. As was the case for acquisition, each session consisted of 30 trials in which the object was presented for 30 s, separated by ITIs of 15 s. In contrast to the acquisition phase, however, no food was placed in the food well beneath the object. On each trial the experimenter recorded whether or not the monkey displaced the object within the 30 s trial period.
Where appropriate, the data were analyzed with SPSS statistical software using repeated-measures ANOVA with Huynh–Feldt correction. Analyses used test (reinforcer devaluation, two levels), reversal (object reversal learning, nine levels), and block and session (extinction; five levels each) as within-subject factors and group (three levels) as a between-subjects factor. Further post hoc analyses were conducted using simple main effects to explore any significant main effects or interactions (p < 0.05).
Object discrimination learning
Monkeys learned the initial 60 discrimination problems in a mean of 8.75 sessions (range, 4–15). The three groups learned at a similar rate (mean sessions to criterion: one-way ANOVA, F(2,9) = 1.41, p > 0.25; mean total errors: one-way ANOVA, F(2,9) = 1.12, p > 0.3).
Reinforcer devaluation tests 1 and 2
Following selective satiation, monkeys in the CON group consistently chose objects associated with the higher-value (nonsated) food across tests 1 and 2. This was manifest as a shift in object choices from baseline. By contrast, monkeys with excitotoxic lesions within PFo either did not show the same shift in choices (group area 11/13) or were less consistent in their choices across the two tests (group area 14). To determine the degree to which subjects' choices changed in sessions following selective satiation compared to baseline sessions, a “difference score” (DS) was calculated for each monkey. The DS was determined by separately computing the mean number of choices each monkey made of food 1- and food 2-associated objects in the two baseline sessions [Equation 1, X(a) and X(b), respectively]. The number of object choices following selective satiation for a particular food, either food 1-associated objects or food 2-associated objects (A and B, respectively) was then subtracted from the mean baseline score for each object type. Finally, these scores were summed to produce an overall difference score. Higher difference scores reflect a greater shift from baseline following selective satiation and therefore a greater sensitivity to the current value of the foods:
\[ \mathrm{DS} = \bigl(X(a) - A\bigr) + \bigl(X(b) - B\bigr) \qquad (1) \]
These difference scores were compared across the two tests (reinforcer devaluation tests 1 and 2) for the three groups (CON group, area 11/13 group and area 14 group) using a repeated-measures ANOVA (Fig. 3A). This analysis revealed a significant effect of test (F(1,9) = 7.11, p < 0.05) and a significant effect of group (F(2,9) = 5.04, p < 0.05). Post hoc tests revealed that the area 11/13 group obtained significantly lower difference scores relative to the CON group (p < 0.05).
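The difference score described in the preceding paragraph can be computed as in the sketch below. The function and argument names are hypothetical, the numbers are made up for illustration, and the sketch assumes that A and B are the choices of food 1-associated objects after food 1 satiation and of food 2-associated objects after food 2 satiation, respectively.

```python
from statistics import mean

def difference_score(baseline_food1, baseline_food2,
                     food1_choices_after_sating_food1,
                     food2_choices_after_sating_food2):
    """Sum of the drops from mean baseline for each devalued object type.

    baseline_food1 / baseline_food2: choices of food 1- and food 2-associated
    objects in each of the two baseline sessions, i.e. the inputs to X(a), X(b).
    The last two arguments correspond to A and B in the text (assumed reading).
    """
    shift_food1 = mean(baseline_food1) - food1_choices_after_sating_food1
    shift_food2 = mean(baseline_food2) - food2_choices_after_sating_food2
    return shift_food1 + shift_food2

# Made-up example: 30-trial sessions with roughly even baseline choices.
example_ds = difference_score([16, 14], [14, 16], 4, 6)
```

Here the baseline means are 15 for each object type, so the score is (15 − 4) + (15 − 6) = 20. With 30-trial sessions and evenly split baseline choices, a monkey that never chose the devalued object type would score 30.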
The deficit of the area 11/13 group was most prominent in reinforcer devaluation test 2, where their difference scores were significantly lower than those of both the CON group and the area 14 group (F(2,9) = 9.01, p < 0.01; post hoc tests; CON vs area 11/13, p < 0.01; area 11/13 vs area 14, p < 0.01; CON vs area 14, p > 0.05).
There was also a significant group by test interaction (F(2,9) = 5.383, p < 0.05) as monkeys in the area 14 group (p < 0.05), but not the CON or area 11/13 group (p > 0.4), significantly altered the degree to which they shifted their choices across the two tests (Fig. 3A). A shift in difference scores across the two tests is something that we have reported in previous groups of either control or lesion monkeys (Baxter et al., 2000; Izquierdo and Murray, 2004). One possible explanation for the change across tests is that inadvertent damage to areas 11/13 caused a transient effect. A Pearson correlation conducted on the extent of damage to area 11/13 and the difference score on devaluation test 1 in the monkeys with area 14 lesions, however, failed to reveal a significant relationship (r = 0.325, p > 0.5). An alternative explanation is that monkeys in this group had recovery of function during devaluation test 1. To test for recovery of function within the first devaluation test, we examined scores across five-trial blocks. A 6 (trial blocks) by 2 (lesion group) repeated-measures ANOVA failed to reveal a block by group interaction or main effect of group when area 14 lesion animals were directly compared to controls (block by group interaction, F(5,30) = 1.14, p > 0.3; effect of group, F(1,6) = 3.13, p > 0.1). Thus, there was no evidence for recovery of function in the monkeys with area 14 lesions during reinforcer devaluation test 1.
Although both operated groups ate slightly more on average than the CON group (mean ± SEM: CON, 120.4 ± 7.1 g; area 11/13, 165.1 ± 22.9 g; area 14, 150.5 ± 13.6 g), the amount of food the groups consumed in the selective satiation procedures conducted before reinforcer devaluation tests 1 and 2 did not differ (F(2,9) = 2.05, p > 0.1). The number of times per session following selective satiation that monkeys chose an object but did not retrieve the food underneath also did not differ between groups (mean ± SEM: CON, 1.25 ± 0.6; area 11/13, 1.06 ± 0.98; area 14, 1.56 ± 0.4). A repeated-measures ANOVA of the number of unretrieved foods across the two devaluation tests for the three groups did not reveal any significant main effects or interactions (effect of test, F(1,9) = 3.05, p > 0.1; effect of group, F(2,9) = 0.78, p > 0.5; test by group interaction, F(2,9) = 2.66, p > 0.1).
Reinforcer devaluation test 3 (food only)
When monkeys were presented with the two food items in the absence of objects, all groups consistently chose the food on which they had not been sated (Fig. 3B). A one-way ANOVA of the difference scores in reinforcer devaluation test 3 confirmed this impression (F(2,9) = 2.81, p > 0.1). There were also no group differences in the amount of food consumed in the selective satiation procedure before reinforcer devaluation test 3 (F(2,9) = 1.31, p > 0.1).
Object reversal learning
All monkeys, regardless of group, readily acquired the object discrimination (Fig. 4, ACQ). Comparison of the number of errors to criterion for the acquisition phase did not reveal any significant differences between the groups (F(2,9) = 2.45, p > 0.1). When the reward contingencies were reversed, all groups were able to rapidly acquire the new reward contingencies (Fig. 4). A repeated-measures ANOVA of the mean errors to criterion across the nine reversals revealed a significant effect of reversal (F(8,72) = 10.14, p < 0.01), but no significant group by reversal interaction (F(16,72) = 1.24, p > 0.3), indicating that all three groups developed a reversal learning set (i.e., made fewer errors to criterion the more reversals that they completed). There was no main effect of group (F(2,9) = 0.39, p > 0.05), confirming the impression that both operated groups performed similarly to the CON group.
To test whether either of the two operated groups differed from the CON group in their ability to switch object choices or use either positive or negative feedback to guide subsequent choices, we conducted a more fine-grained analysis that has previously been used to probe the effects of complete PFo lesions made by aspiration (Rudebeck and Murray, 2008). To control for differences in the number of trials across monkeys, we included only the first 60 trials after each reversal. First, the total errors each monkey made across the nine reversals were divided into two types: (1) those errors made after a reversal but before a correct choice had been made; and (2) those errors made after a correct choice. This was done to assess whether, following a reversal, monkeys in either lesion group were slower to use negative feedback to switch to choosing the rewarded option, or whether, following a correct choice after a reversal, monkeys in either group were slower to use this positive information to guide their choices. Examining the errors by this method, however, did not reveal any differences between the groups (Fig. 4B). Neither the area 11/13 group nor the area 14 group made more of either type of error than the CON group (repeated-measures ANOVA of error type: effect of error type, F(1,9) = 8.14, p < 0.05; error type by group interaction, F(2,9) = 0.64, p > 0.5; effect of group, F(2,9) = 0.32, p > 0.05).
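The division of errors into the two types described above can be illustrated with a minimal Python sketch. It is not the analysis code used in the study: the function name `split_errors`, the boolean trial encoding, and the example sequence are assumptions made purely for illustration.

```python
def split_errors(outcomes, max_trials=60):
    """Count errors before and after the first correct choice of one reversal.

    outcomes: sequence of booleans for the trials following a reversal,
              True = correct (rewarded) choice, False = error.
    Only the first `max_trials` trials are considered, mirroring the
    60-trial window described in the text.
    """
    before_first_correct = 0  # type 1: errors made before any correct choice
    after_first_correct = 0   # type 2: errors made after a correct choice
    seen_correct = False
    for correct in outcomes[:max_trials]:
        if correct:
            seen_correct = True
        elif seen_correct:
            after_first_correct += 1
        else:
            before_first_correct += 1
    return before_first_correct, after_first_correct


# Hypothetical trial sequence for a single reversal (not real data).
example = [False, False, True, False, True, True, False, True]
early_errors, late_errors = split_errors(example)
print(early_errors, late_errors)
```

Summing the two counts over the nine reversals would give each monkey's totals for the two error types.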
Despite this finding, it could still be the case that monkeys with area 11/13 or area 14 lesions might be differentially influenced by positive or negative feedback compared to controls (Fig. 4C). To explore this possibility, we analyzed monkeys' choices on the trial following either positive feedback (when a choice was rewarded, Correct + 1) or negative feedback (when a choice was not rewarded, Error + 1). Although there was a main effect of feedback type (repeated-measures ANOVA on effect of feedback type: F(1,9) = 467.15, p < 0.01), no group differences or interactions emerged (feedback type by group interaction: F(2,9) = 1.12, p > 0.3; effect of group: F(2,9) = 0.985, p > 0.4). Further analysis of monkeys' choices following strings of correctly or incorrectly performed trials similarly did not reveal any group differences (p > 0.1). Thus, neither lesions of areas 11/13 nor lesions of area 14 had any discernable effect on object reversal learning.
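As a rough illustration of the Correct + 1 and Error + 1 analysis, the sketch below computes the proportion of correct choices on trials that immediately follow a rewarded or an unrewarded choice. The exact dependent measure is an assumption here, as are the function name `performance_after_feedback` and the example data.

```python
def performance_after_feedback(outcomes):
    """Proportion of correct choices on the trial after each feedback type.

    outcomes: sequence of booleans, True = rewarded (correct) choice.
    Returns (correct_plus_1, error_plus_1); a value is None when no trial
    of that feedback type is followed by another trial.
    """
    after_correct = []
    after_error = []
    # Pair every trial with the trial that follows it.
    for previous, following in zip(outcomes, outcomes[1:]):
        if previous:
            after_correct.append(following)
        else:
            after_error.append(following)

    def proportion(values):
        return sum(values) / len(values) if values else None

    return proportion(after_correct), proportion(after_error)


# Hypothetical sequence of choices (not real data).
example = [False, False, True, True, False, True, True, True]
correct_plus_1, error_plus_1 = performance_after_feedback(example)
print(correct_plus_1, error_plus_1)
```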
Object preference task
Object discrimination learning
When presented with a novel set of 60 pairs of objects to discriminate, monkeys in all groups swiftly learned which ones were associated with food. Monkeys learned in a mean of 7.3 sessions (range, 4–11). Neither the number of errors to criterion (F(2,9) = 1.24, p > 0.3) nor the number of sessions to criterion (F(2,9) = 1.00, p > 0.4) revealed significant differences between the groups.
To test preference judgments independent of learning object–food associations, monkeys were presented with novel combinations of familiar, rewarded objects. Monkeys in the CON group made consistent object choices that we assume reflect their food preferences. By contrast, the area 14 group were less consistent in their choices of object types (Fig. 5). To quantify the degree to which the object choices were inconsistent with overall food preferences, we calculated a cumulative “preference disparity score” (Equation 2) for each monkey. This measure was based on a set of preference weights for each object type that allowed us to take into account the degree to which choices were inconsistent. The score was calculated in a series of steps. First, the number of times individual monkeys chose each of the object types across the four test sessions was tabulated. For example, across the four test sessions, one monkey chose 47 M&M's objects, 30 fruit snacks objects, 20 peanut objects, 16 raisin objects, and 7 banana pellet objects. Second, the tabulated scores were normalized to give a set of preference weights by subtracting the number of times the least chosen object type was selected and dividing by the total range of the number of choices for each of the five object types (Equation 2, preference weights: Wa–e). Taking the example above, choices would be normalized by subtracting 7 (banana pellet choices) and then dividing by 40 (the total range [47–7]) to yield a set of preference weights of 1.0, 0.575, 0.325, 0.225, and 0, respectively. Next, trials were identified where monkeys chose the object associated with the lower preference weight of the two objects presented. In each instance a preference disparity score was determined based upon the difference in preference weight (Wa − Wb), and this disparity score was multiplied by the frequency of choice for the object of lower value (Fb). 
For example, if across the course of the four test sessions the monkey chose an object associated with a raisin over one associated with an M&M's four times, the preference disparity score would be 0.775 (1.0 − 0.225), which would then give a cumulative score of 3.1 (4 × 0.775). Finally, the disparity scores were summed across all trial types to provide a cumulative preference disparity score that reflected the inconsistency of each monkey's choices:
$$\text{cumulative preference disparity score} = \sum (W_a - W_b) \times F_b \qquad (2)$$
Comparison of these scores across the different groups revealed a trend for a main effect (one-way ANOVA, F(2,9) = 3.04, p = 0.09), and post hoc tests, although not strictly permissible, were performed to probe this result further. The choices of monkeys with lesions of area 14 were less consistent than those of the other groups; statistical tests again revealed marginally significant differences between the groups (area 14 vs CON, p = 0.053; area 14 vs 11/13 group, p = 0.074). Monkeys with area 11/13 lesions did not differ from monkeys of the CON group (p > 0.5).
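The scoring steps described above (tabulate choices, normalize to preference weights, then sum weighted disparities) can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code. The function names and data layout are hypothetical, and the choice counts are those of the worked example in the text.

```python
def preference_weights(choice_counts):
    """Normalize choice counts to weights between 0 and 1.

    Subtracts the count of the least chosen object type and divides by
    the range of counts, as described for the preference weights.
    """
    lowest = min(choice_counts.values())
    span = max(choice_counts.values()) - lowest
    if span == 0:
        raise ValueError("all object types chosen equally often; weights undefined")
    return {food: (n - lowest) / span for food, n in choice_counts.items()}


def cumulative_disparity(weights, inconsistent_choices):
    """Sum (W_a - W_b) * F_b over pairings where the lower-weight object won.

    inconsistent_choices maps (higher_weight_food, chosen_food) to F_b,
    the number of times the lower-weight object was chosen.
    """
    total = 0.0
    for (preferred, chosen), frequency in inconsistent_choices.items():
        total += (weights[preferred] - weights[chosen]) * frequency
    return total


# Choice counts from the worked example in the text.
counts = {"M&M's": 47, "fruit snack": 30, "peanut": 20,
          "raisin": 16, "banana pellet": 7}
weights = preference_weights(counts)

# The raisin-over-M&M's example: four inconsistent choices.
score = cumulative_disparity(weights, {("M&M's", "raisin"): 4})
print(weights)
print(round(score, 3))
```

With these inputs the weights reproduce the values given in the text (1.0, 0.575, 0.325, 0.225, and 0), and the single inconsistent pairing contributes 3.1 to the cumulative score.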
One monkey that had sustained an excitotoxic lesion of area 14 was unable to complete the extinction test because of issues unrelated to testing. As a result, data from only three subjects in this group were available for analysis. All monkeys readily acquired the association between the single novel object and the food reward, often attaining criterion within the minimum of five sessions. During the acquisition phase both operated groups behaved similarly to the CON group, apart from one monkey in the area 11/13 group who took considerably longer to reach the criterion (mean omissions ± SEM: CON = 0.25 ± 0.25, area 11/13 = 15 ± 14.34, and area 14 = 0.67 ± 0.29). An analysis comparing the number of omissions made by the monkeys in each group verified that there were no group differences (F(2,8) = 0.87, p > 0.45). Likewise, there were no group differences in the number of sessions taken to attain criterion (F(2,8) = 0.85, p > 0.46).
During the extinction phase, the rate at which all monkeys displaced the object decreased across the five testing sessions (Fig. 6A). Monkeys with excitotoxic lesions of area 14 were slower to extinguish responding than the other two groups in the first extinction session (Fig. 6A). To examine this in more detail, monkeys' choices in the first 30-trial extinction session were divided into five blocks of six trials each (Fig. 6B). An ANOVA revealed that the area 14 group was slower than the CON group to extinguish responding (F(1,4) = 6.25, p = 0.05), but the area 11/13 and CON groups did not differ from each other (F(1,5) = 0.05, p > 0.5).
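The blocking of the first extinction session can be sketched as follows. The sketch assumes each trial is coded as True when the monkey displaced the unrewarded object. The function name `responses_per_block` and the response pattern are hypothetical and serve only to show the bookkeeping.

```python
def responses_per_block(displaced, block_size=6):
    """Count object displacements in consecutive blocks of trials.

    displaced: sequence of booleans, True = the object was displaced.
    With 30 trials and block_size=6 this yields five block totals.
    """
    return [sum(displaced[start:start + block_size])
            for start in range(0, len(displaced), block_size)]


# Hypothetical 30-trial session in which responding gradually declines.
session = [True] * 14 + [False, True] * 4 + [False] * 8
blocks = responses_per_block(session)
print(blocks)
```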
This difference among groups did not persist in subsequent testing sessions. A repeated-measures analysis of the number of unrewarded object displacements made across the five extinction sessions found no significant main effect of group (F(2,8) = 0.52, p > 0.5), although there was a significant main effect of session (F(4,32) = 12.27, p < 0.001). This finding indicates that the monkeys extinguished their responding over the five sessions to the same degree, although it can be appreciated from Figure 6A that the area 14 group showed more responses and more variability overall.
The present study compared the effects of subtotal excitotoxic lesions within the PFo to provide a better understanding of how anatomically distinct subregions of this part of the prefrontal cortex contribute to reward-guided behavior. Monkeys with lesions of area 11/13, but not those with lesions of area 14, were severely impaired in their ability to make object choices based on the current value of the goal as assessed in the reinforcer devaluation task. By contrast, monkeys with lesions of area 14, but not those with lesions of area 11/13, exhibited mild slowing of the rate at which they extinguished responding to a previously rewarded object. Together, these findings reveal a double dissociation of function within PFo. Surprisingly, neither lesions of area 11/13 nor lesions of area 14 affected monkeys' abilities to respond flexibly in the object reversal learning task, although it is known that combined ablations made by aspiration do so.
The role of areas 11 and 13 in object-value representations
With direct connections from amygdala as well as converging input from gustatory, olfactory, and visual cortex, area 11/13 is ideally situated to integrate the specific sensory qualities of reinforcement with visual and other sensory information (Carmichael and Price, 1995a,b). Accordingly, it has been proposed that this part of PFo is specialized for the assessment of sensory objects such as foods (Saleem et al., 2008). The inability of monkeys with lesions of area 11/13 (Fig. 3A) to either update the value of the food outcome following selective satiation or to use the updated value to make adaptive choices confirms this hypothesis. The magnitude of the deficit following excitotoxic 11/13 lesions is similar to that seen following aspiration lesions of the complete PFo (Izquierdo et al., 2004). These findings suggest that, within PFo, area 11/13 mediates the value-representation functions when choices involve objects. However, monkeys with lesions of area 11/13 were just as able as unoperated controls to adaptively choose foods when they were presented in the absence of objects (reinforcer devaluation test 3). This difference between tests 1 and 3 presumably occurs because monkeys are able to associate the visual properties of the food with the current biological value of the food during the 30 min selective satiation procedure. As no such experience of the objects was available, the monkeys could not update the value of the object representations.
Our results point to a key role for area 11/13 in permitting an object to elicit representations of the current value of expected outcomes, especially after those values have changed. One possibility is that the object–food mappings are stored in the PFo, and selective satiation leads to rapid updating of the value representations associated with the object or the mapping. These valuations are then used to guide object choices. Such a finding complements recent reports that PFo is important for learning the value of stimuli in settings where the relationship between stimuli and reinforcement is probabilistic and can rapidly change over time (Rudebeck et al., 2008; Walton et al., 2010). It also agrees with evidence from single-unit neurophysiological studies. Specifically, the activity of neurons within areas 11 and 13 is modulated by the value of rewards associated with a particular stimulus (Wallis and Miller, 2003; Padoa-Schioppa and Assad, 2008). A role for area 11/13 in rapid value updating is further underscored by the finding that monkeys in the 11/13 group were able to choose objects appropriately in the object preference task (Fig. 5), in which object–food value associations had been learned before the object choices.
A role for area 14 in extinction
Monkeys with aspiration lesions of the entire PFo (areas 11, 13, and 14) are slower than controls to extinguish responding to a previously rewarded object (Izquierdo and Murray, 2005). Although the deficit associated with excitotoxic lesions of area 14 reported here is smaller than that previously reported following aspiration lesions of the entire PFo, the present data suggest that the part of PFo critical for extinction learning is area 14. It could be argued that the present deficit might be due to inadvertent damage to area 25, also known as infralimbic cortex (IL). Analysis of the extent of the lesion indicated that area 25 was largely intact in monkeys with area 14 lesions. Furthermore, studies in rats that have implicated the IL in extinction learning (Rhodes and Killcross, 2004; Quirk et al., 2006) have predominantly shown that damage within the IL leads to spontaneous recovery within the second session after the relationship between stimulus and reward has been extinguished and not to alterations in extinction learning per se, as reported here.
The deficit in extinction learning following lesions of area 14 may be related to the role of PFo in visually guided rules. Interaction between PFo and inferotemporal cortex is important for implementing visually guided rules (Bussey et al., 2002; Browning et al., 2007). However, such an explanation would not be consistent with the finding that lesions of area 14 failed to disrupt object reversal learning. An alternative possibility is that the contribution of area 14 to extinction learning is related to the role of PFo in modulating visceromotor responses. The most medial parts of PFo have connections to the hypothalamus, periaqueductal gray, and amygdala and play a role in modulating autonomic responses (Kaada et al., 1949; Carmichael and Price, 1996). Lesions of the PFo lead to a dysregulation of autonomic responses. Specifically, Reekie et al. (2008) found that blood pressure in monkeys with excitotoxic lesions of PFo was higher than that in controls after reward omission during one-trial extinction learning. Thus, one might posit that the mild deficit in extinction learning observed after area 14 lesions is due to increased autonomic arousal following nonreceipt of reward, which in turn leads to a higher rate of object displacements during the first session of extinction. An important avenue for future research will be to determine the contribution of area 14 to the regulation of autonomic responses.
In the object preference task, monkeys were required to make object choices after object–value associations had been learned, but before the monkeys had any experience directly comparing the outcomes of object choices. Thus, the object preference test was meant to probe value judgments independent of learning. Neither area 14 nor area 11/13 lesions led to an impairment on this task, although there was a trend for monkeys with area 14 lesions to differ from both controls and monkeys with area 11/13 lesions (Fig. 5). This trend for monkeys with area 14 lesions to have unstable object preferences is in line with functional imaging and patient studies in humans that suggest that the key function of medial parts of PFo, including areas 14 and 10m, may be to compare the value of different options during choice (Fellows and Farah, 2007; Boorman et al., 2009). Consistent with this idea, it has been reported that aspiration lesions of medial PFo, including area 14 but not area 11/13, disrupt the ability of monkeys to make independent value comparisons in a probabilistic three-armed bandit task (Noonan et al., 2010).
Reversal learning and functional specialization within PFo
That neither lesion affected monkeys' performance on object reversal learning is surprising given the long history of the effects of PFo lesions on this task (Butter et al., 1963; Jones and Mishkin, 1972; Dias et al., 1996; Izquierdo et al., 2004; Fellows and Farah, 2005). Our results confirm the findings of Kazama and Bachevalier (2009), who reported that monkeys with lesions of areas 11 and 13 were unimpaired on object reversal learning. In addition, our results show that area 14, which has been heavily implicated as the part of the PFo that mediates object reversal learning (Kazama and Bachevalier, 2009), like area 11/13, is not essential for this task.
There are at least two explanations for our failure to observe a deficit in object reversal learning after removal of subregions of PFo. The first is that combined damage to areas 11, 13, and 14 is required to yield a deficit on this task. This explanation is consistent with the available data on the effects of aspiration lesions of PFo cited above and the effects of either excitotoxic or aspiration lesions of subregions of PFo (Kazama and Bachevalier, 2009; present study). A second possibility is that inadvertent damage to fibers of passage, either alone or together with damage to PFo, is responsible for the deficit. Determining the effects on object reversal learning of fiber-sparing, excitotoxic lesions of the entire PFo, including areas 11, 13, and 14, will be required to test this possibility.
In summary, our results show that areas 11/13 and 14 make dissociable contributions to reward-guided behavior. Despite the fact that some functions are entirely dependent on subregions of PFo (e.g., rapid value updating), it may also be the case that one subregion of PFo is able to compensate for damage to the other through interactions with other parts of the brain that play a role in reward-guided behavior, including the medial frontal cortex, amygdala, and striatum (Rushworth et al., 2007; Rangel et al., 2008). This explanation underscores the importance of considering the PFo as composed of anatomically and functionally distinct subregions.
This work was supported by the Intramural Research Program of the National Institute of Mental Health. We thank Dawn Lundgren, Emily Howland, and Anna Prescott for assistance with data collection, Richard Saunders and Rachel Reoli for help performing surgery, and Steve Wise and Janine Simmons for useful comments on the manuscript. We also thank the staff of the Nuclear Magnetic Resonance Facility, National Institute of Neurological Disorders and Stroke, and the Laboratory of Diagnostic and Radiology Research, as well as Megan Malloy for assistance in obtaining the MR scans.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Peter H. Rudebeck, Laboratory of Neuropsychology, National Institute of Mental Health, Building 49, Suite 1B80, 49 Convent Drive, Bethesda, MD 20892-4415.