Abstract
Subregions of prefrontal cortex are important for estimating reward values and using these values to guide behavior. The present studies directly tested whether orbital prefrontal cortex (O-PFC) and lateral prefrontal cortex (L-PFC) are necessary for evaluating trial-to-trial changes in the reward values predicted by visual cues. We compared intact rhesus monkeys, those with bilateral O-PFC lesions (n = 3), and those with bilateral L-PFC lesions (n = 3). We used three versions of a visually cued color discrimination task: we varied reward size, delay to reward, or both. O-PFC lesions altered estimations of predicted reward value in all versions of the task. L-PFC lesions disrupted performance only when both reward size and delay to reward were varied together. Neither lesion directly affected basic internal drive states (satiation curves). Our results suggest that O-PFC is essential for establishing independent, context-specific scales with which predicted reward values are measured. L-PFC appears necessary for integration of predicted reward value across these different scales.
Introduction
Several lines of investigation suggest subregions of prefrontal cortex are important for estimating reward values and using these values to guide behavior. Lesion, electrophysiology, and functional imaging data show that orbital prefrontal cortex (O-PFC) is engaged in reward processing (Kringelbach and Rolls, 2004; O'Doherty, 2007; Wallis, 2007; Murray and Wise, 2010). Lesions of O-PFC impair animals' abilities to form and update stimulus–reinforcer associations (Jones and Mishkin, 1972; Gallagher et al., 1999; Izquierdo et al., 2004; Rudebeck et al., 2008; Walton et al., 2010). O-PFC neurons respond differentially to cues predicting outcomes with different values (Thorpe et al., 1983; Hikosaka and Watanabe, 2000; Tremblay and Schultz, 2000; Wallis and Miller, 2003; Padoa-Schioppa and Assad, 2006; Simmons and Richmond, 2008). Factors such as reward size, payoff probability, and effort modulate representations of stimulus-associated reward value in O-PFC (Tremblay and Schultz, 1999; Schoenbaum and Setlow, 2001; Elliott et al., 2003; Wallis and Miller, 2003; Roesch and Olson, 2004; Kennerley et al., 2009; Bouret and Richmond, 2010).
Direct evidence that O-PFC is required for computing reward value is more limited and difficult to interpret. Lesions encompassing orbital and ventromedial prefrontal cortex disrupt human performance on the multifaceted Iowa gambling task. The impairments may result from difficulty calculating reward size and probability, reversing previously established associations, or integrating that information across time and decks of cards (Bechara et al., 1994; Fellows, 2007). In rats, O-PFC lesions have disrupted the rate of temporal discounting. However, in different reports, these rates changed in opposite directions, leaving open the question of the role of O-PFC in temporal discounting (Kheramin et al., 2002; Mobini et al., 2002; Winstanley et al., 2004).
Lateral prefrontal cortex (L-PFC) is more frequently associated with the integration of information over time and space than with the computation of reward value (Goldman-Rakic, 1995; Leon and Shadlen, 1999; Hikosaka and Watanabe, 2000; Fuster, 2001; Miller and Cohen, 2001; Roesch and Olson, 2004). It has been suggested that neurons in L-PFC integrate and use reward value information to guide decision making and action (Kobayashi et al., 2002; Roesch and Olson, 2003; Wallis and Miller, 2003; Kennerley and Wallis, 2009a). However, this hypothesis has not yet been explored fully (Kable and Glimcher, 2009; Wallis and Kennerley, 2010).
Here, we directly tested whether O-PFC and L-PFC were necessary for evaluating trial-to-trial changes in predicted reward values. We compared monkeys with selective, bilateral ablations of either O-PFC or L-PFC to unoperated controls, using a task in which visual cues predicted reward size and delay to reward, alone or in combination (Minamimoto et al., 2009). By using a nonchoice paradigm, we sought to distinguish representations of reward value from downstream decision-making processes (Rangel et al., 2008).
We hypothesized that O-PFC lesions would affect the ability of animals to scale reward value as a function of reward size and delay, individually and in combination. Because our paradigm does not require working memory and because the action performed is constant across trials, we did not expect L-PFC lesions to alter performance. Rather, the monkeys with L-PFC lesions were to serve as a priori operated controls.
Materials and Methods
Subjects.
Subjects were 14 male rhesus monkeys, weighing between 5 and 11.5 kg at the time of behavioral testing. Eight monkeys served as historical, unoperated controls [CON results previously reported by Minamimoto et al. (2009)]. The experimental groups (L-PFC and O-PFC) contained three monkeys each. All monkeys were fed a diet of primate chow (catalog #5038; PMI Feeds) supplemented with fresh fruit and vegetables. Access to water was controlled to ensure adequate motivation to perform the behavioral tasks while maintaining a healthy body weight. All experimental procedures were performed in accordance with the Guide for the Care and Use of Laboratory Animals and were approved by the Animal Care and Use Committee of the National Institute of Mental Health.
Surgery.
Monkeys received bilateral lesions of orbital or lateral prefrontal cortex using a combination of suction and electrocautery. All surgeries took place in a veterinary operating facility using aseptic procedures. Anesthesia was induced with ketamine hydrochloride (10 mg/kg, i.m.) and maintained using isoflurane gas (1.0–3.0% to effect). Vital signs were monitored throughout.
The intended L-PFC lesion (Fig. 1A) extended laterally from the dorsal midline to the orbital surface of the inferior convexity. The rostral limit of the lesion was the frontal pole. The caudal limit was the caudal extent of the principal sulcus. The frontal eye fields and the banks of the arcuate sulci were intentionally spared. In total, the intended lateral prefrontal lesion included areas 9, 45, 46, 12, and dorsal area 10 (Walker, 1940).
The intended O-PFC lesion (Fig. 2A) extended from the fundus of the lateral orbital sulcus to the fundus of the rostral sulcus. The rostral limit of the lesion was a line joining the anterior tips of the lateral and medial orbital sulci. The caudal limit was ∼5 mm rostral to the junction of the frontal and temporal lobes. In total, the intended orbital prefrontal lesion included areas 11, 13, 14, and the caudal part of ventral area 10 (Walker, 1940).
The lateral and orbital prefrontal lesions shared a common boundary at the lateral orbital sulcus, but did not overlap.
Assessment of the lesions.
Lesion size and location were assessed from T1-weighted magnetic resonance image (MRI) scans. MRIs were obtained on a 1.5 T General Electric Signa unit, using a 5 inch surface coil and a three-dimensional volume spoiled GRASS pulse sequence. Slices were taken every 1 mm, with an in-plane resolution of 0.4 mm. Coronal slices from the MR images were matched to coronal plates in a stereotaxic rhesus monkey brain atlas. The extent of the visible lesion was then plotted on each plate and subsequently reconstructed on the ventral or lateral views.
Behavioral testing.
For all behavioral training and testing, each monkey sat in a primate chair inside a sound-attenuated dark room. Visual stimuli were presented on a computer video monitor in front of the monkey. Behavioral control and data acquisition were performed using the REX program (Hays et al., 1982). Neurobehavioral Systems Presentation software was used to display visual stimuli (Neurobehavioral Systems).
The basic task consisted of a series of color discrimination trials (see Fig. 3A). Each trial was initiated when the monkey touched a bar mounted at the front of the chair. To perform a trial correctly, the monkey was required to release the bar between 200 and 1000 ms after a red spot (wait signal) turned green (go signal). On correctly performed trials, the spot then turned blue (correct signal), and a liquid reward was delivered.
A visual cue was presented at the beginning of each color discrimination trial (500 ms before the red spot appeared) (see Fig. 3A). In all cases, the visual cue provided valid information about the predicted trial outcome. This information was available on a trial-by-trial basis, but monkeys were not specifically trained or required to use cue-related information to perform the task. Every correctly performed color discrimination trial was rewarded. After an error (a mistimed bar release), the monkey had to repeat and correctly complete the same trial type to receive a reward.
The specific task used in these experiments was identical with that used by Minamimoto et al. (2009). The task was designed to evaluate performance associated with (1) reward size, (2) delay to reward, and (3) the combination of size and delay. The reward size version also allowed us to estimate the effects of internal drive state (satiation).
In reward size sessions, reward size varied on a trial-to-trial basis; the delay to reward was immediate and fixed across trials. We used four different reward sizes and four corresponding visual cues (see Fig. 3B). Correct trials were rewarded with 1, 2, 4, or 8 drops of water (1 drop = ∼0.1 ml). Each reward size was uniquely paired with a visual cue. Within a session, trials with each reward size were selected with equal probability and presented in a random order. Four monkeys from the unoperated CON group were tested in reward size sessions.
In delay-to-reward sessions, the timing of the reward delivery varied on a trial-to-trial basis; reward size was fixed across trials (see Fig. 3C). After a correctly timed bar release, the reward (∼0.5 ml) was delivered immediately (0.3 ± 0.1 s), after a moderate delay (3.6 ± 0.6 s), or after a long delay (7.2 ± 1.2 s). Each delay was uniquely paired with a visual cue. Within a session, trials with each of the three different delays were chosen with equal probability and presented in a random order. Eight monkeys from the unoperated CON group were tested in delay-to-reward sessions.
In the reward-size-and-delay sessions, both reward size and delay to reward varied on a trial-to-trial basis (see Fig. 3D). Two sizes (1 or 4 drops; 1 drop = ∼0.25 ml) and four delays (0.3 ± 0.1, 3.6 ± 0.6, 7.2 ± 1.2, or 10.8 ± 1.8 s) were used in a crossed design. The eight resulting trial types were each paired with a different visual cue. The visual cue presented at the beginning of each trial indicated the size of the reward to be provided and the time interval after which it would be delivered if the trial were performed correctly. Trials with each of the eight possible reward size and delay combinations were chosen with equal probability and presented in a random order. Five monkeys from the unoperated CON group were tested in the reward-size-and-delay sessions.
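The crossed design described above can be illustrated with a short sketch. The reward sizes and mean delays are taken from the text; the variable and function names are hypothetical, and the sketch omits both the jitter around each mean delay and the repetition of a trial type after an error.

```python
import itertools
import random

# Values from the text; names are illustrative only.
REWARD_SIZES_DROPS = (1, 4)
MEAN_DELAYS_S = (0.3, 3.6, 7.2, 10.8)

# Crossed design: every size paired with every delay gives 8 trial types,
# each of which would be assigned its own visual cue.
TRIAL_TYPES = list(itertools.product(REWARD_SIZES_DROPS, MEAN_DELAYS_S))


def draw_trials(n_trials, seed=None):
    """Draw trial types independently, each with equal probability."""
    rng = random.Random(seed)
    return [rng.choice(TRIAL_TYPES) for _ in range(n_trials)]


if __name__ == "__main__":
    for size, delay in draw_trials(5, seed=0):
        print(f"{size} drop(s) after a mean delay of {delay} s")
```

Drawing each trial independently is one way to realize "equal probability, random order"; a shuffled block design would be another, and the text does not say which was used.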
Before entering this experiment, all monkeys had been previously trained to perform color discrimination trials. All had been tested on multiple different types of visually cued color discrimination tasks and were very experienced with the introduction of new visual cues.
Monkeys were tested 5 d per week, for 10–20 testing sessions per task condition. Each session continued until the monkey would no longer initiate a new trial. For each condition, the error rate patterns observed with the full data sets were clearly apparent within the first three sessions. Additional sessions were run for more statistical power. The number of sessions per task condition did not differ across groups.
All monkeys were tested first with delay-to-reward sessions. This order was established so that monkeys would not refuse to work with delays after experiencing sessions in which rewards could be obtained immediately on all trials. Monkeys in the L-PFC and O-PFC groups next completed size-and-delay sessions, followed by reward size sessions.
Not all monkeys in the control group were tested in every task condition. Two monkeys ran the three conditions in the same set order as the L-PFC and O-PFC groups. Two monkeys were tested in a different order: (1) delay, (2) size, (3) size and delay. One monkey performed only the first two tasks, with delay to reward followed by size and delay. Three monkeys performed only delay to reward. Neither the order nor the number of task conditions significantly altered the behavior of monkeys in the CON group.
Data analysis.
All data and statistical analyses were performed using the R statistical computing environment (R Development Core Team, 2004). For each task, the error rate for each trial type was calculated by dividing the total number of errors by the total number of trials of that trial type over all sessions. Errors consisted of any anticipatory bar release (<200 ms after the appearance of the go signal) or any failure to release the bar within 1000 ms of the appearance of the go signal. Error rates and reaction times are both dependent on reward size and delay to reward. Because error rates in this task can be used to formulate a model for the joint effect of size and delay, we report error rates as our dependent measure (Minamimoto et al., 2009).
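The error definition and the per-trial-type error rate can be made concrete with a minimal sketch. The 200 ms and 1000 ms limits come from the text; the function names, the labels "early" and "late", and the use of `None` for a bar that was never released are assumptions made for illustration.

```python
from collections import defaultdict

MIN_RELEASE_MS = 200   # earliest valid release after the go signal
MAX_RELEASE_MS = 1000  # latest valid release after the go signal


def classify_release(release_ms):
    """Classify one trial from the bar-release latency relative to go onset.

    release_ms is None when the bar was never released.
    """
    if release_ms is None or release_ms > MAX_RELEASE_MS:
        return "late"
    if release_ms < MIN_RELEASE_MS:
        return "early"
    return "correct"


def error_rates(trials):
    """Return errors / trials for each trial type.

    trials: iterable of (trial_type, release_ms) pooled over all sessions.
    """
    errors = defaultdict(int)
    totals = defaultdict(int)
    for trial_type, release_ms in trials:
        totals[trial_type] += 1
        if classify_release(release_ms) != "correct":
            errors[trial_type] += 1
    return {t: errors[t] / totals[t] for t in totals}
```

Pooling errors and trials over sessions before dividing, as the text describes, weights each trial equally rather than each session.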
The experimental design consisted of a split-plot (mixed, within-subjects) model, with individual monkeys nested within lesion groups (Venables and Ripley, 2002). For the reward size data, we used a two-way ANOVA to measure error rate as a function of reward size and experimental group. Similarly, for the delay-to-reward data, we measured error rate as a function of delay and experimental group. For the reward-size-and-delay data, we used a three-way ANOVA to test the effects of reward size, delay to reward, and experimental group on error rates. Because baseline error rates can vary from individual to individual, the variability accounted for by individual monkeys was included as an error term within each ANOVA (Zar, 1999). This standard statistical procedure allowed us to focus on behavioral patterns across groups, rather than on baseline differences in error rates intrinsic to each individual monkey.
We previously reported orthogonal main effects of reward size and delay to reward on error rates in the intact animals included here as the CON group (Minamimoto et al., 2009). If the lesions caused changes in these behavioral patterns, we expected to see statistically significant main effects of treatment (the lesion) and/or significant interaction terms.
For reward size data, we further describe group differences by trying to fit the data obtained from each monkey with the functions previously found to fit CON data (Minamimoto et al., 2009). The first inverse function used was E = 1/(aR), where R is the reward size, a is a constant parameter, and E is the error rate of the monkeys in trials with reward size R. This version of the task also allowed us to measure the effects of satiation on predicted reward value. To describe the effects of satiation, we divided the reward size sessions into quartiles based on accumulated reward. We first fit the data with the sigmoid function: R′ = R·f(S) =
Results
We tested groups CON, O-PFC, and L-PFC on a task designed to assess the monkeys' estimations of predicted reward value. In each trial, the monkey had to release a bar when a red spot turned green to obtain a reward. The reward contingencies were changed from trial to trial. A visual cue presented at the beginning of each trial indicated the reward size, the delay to reward, or a combination of the two (Fig. 3A). There were three versions of the task: (1) a version in which visual cues indicated reward size independent of delay (Fig. 3B), (2) a version in which cues indicated delay to reward independent of size (Fig. 3C), and (3) a version in which cues indicated both size and delay in a full crossed design (Fig. 3D).
Because error rates in this task can be used to formulate a model for the joint effect of size and delay, we report error rates as a behavioral correlate of the predicted reward value (Minamimoto et al., 2009). Errors consisted of either anticipatory bar release or nonrelease of the bar. There were no statistically significant effects of treatment group on the proportions of the two error types in any task condition. In the delay-to-reward condition, we found a main effect of delay in all groups: as the delay duration lengthened, the proportion of late errors increased significantly (one-way ANOVA, p < 0.001). There was no main effect of reward size on error type. There were no statistically significant interactions between group and size/delay on the proportions of error types.
Error rate differences can arise from idiosyncratic differences in individual behavior. Over many years of working with this type of task, we have noted that absolute error rates can vary considerably across monkeys, even without lesions [La Camera and Richmond (2008), their Fig. 2D].
Here, the clearest example of idiosyncratic behavior is seen in monkey O-PFC-SU. O-PFC-SU had higher error rates across all versions of the task and all outcome contingencies. These higher error rates were observed even when O-PFC-SU was tested on an uncued baseline task in which every trial was rewarded.
To address idiosyncratic differences, we have used split-plot (mixed, within-subject) ANOVAs to model our data. These ANOVAs included an error term that factored out the variability accounted for by individual monkeys such as O-PFC-SU (Zar, 1999). Therefore, although data from individual monkeys are presented, our statistical results do not reflect within-group differences in absolute error rates. Rather, they quantify differences in the pattern of error rates across groups and reward contingencies.
Assessment of the lesions
Lesion location and extent were mostly as intended within each experimental group. In the L-PFC group, all three lesions extended from 23 to 43 mm rostral to the interaural plane (Fig. 1B). All L-PFC lesions included regions 9, 45, 46, and 10, as intended. However, most of 12o (on the orbital surface) and the caudal part of 12l (on the ventrolateral surface) were spared in all three animals. The banks of the arcuate sulcus were spared, as intended. There were no areas of unintended damage.
In the O-PFC group (Fig. 2B), two of the three lesions (O-PFC-JE and O-PFC-SU) extended from 25 to 42 mm rostral to the interaural line; the other was placed slightly more caudally, extending from 23 to 41 mm rostral to the interaural line (O-PFC-BA). Except for a narrow strip of area 14 immediately ventral to the rostral sulcus and a small part of area 13l immediately medial to the lateral orbital sulcus, O-PFC lesions in all three monkeys included all intended regions. There were no areas of unintended damage. O-PFC-SU's higher error rates could not be accounted for by differences in the extent or location of the lesions.
Reward size
Error rates decreased as reward size increased in all three groups of monkeys. However, O-PFC lesions altered monkeys' sensitivity to changes in reward size (Fig. 4). A two-way ANOVA with between-subjects factor of treatment (CON, O-PFC, and L-PFC) and within-subjects factor of reward size (1, 2, 4, 8 drops) showed a main effect of reward size and a significant interaction between reward size and treatment. There was no main effect of treatment (two-way ANOVA with repeated measures; size: df = 3, F = 196, p < 0.001; treatment: df = 2, F = 0.1, p = 0.9; size by treatment: df = 6, F = 7.7, p < 0.001).
As can be inferred from inspection of Figure 4, pairwise comparisons showed that the interaction between reward size and treatment arose only from a difference between the CON and O-PFC groups. Analysis of CON and L-PFC groups showed a main effect of reward size, with no effect of treatment or interaction between size and treatment (two-way ANOVA with repeated measures; size: df = 3, F = 213.8, p < 0.001; treatment: df = 1, F = 0.04, p = 0.85; size by treatment: df = 3, F = 0.72, p = 0.54). Comparison of CON and O-PFC groups showed a main effect of reward size and an interaction between size and treatment, with no effect of treatment (two-way ANOVA with repeated measures; size: df = 3, F = 99.7, p < 0.001; treatment: df = 1, F = 0.08, p = 0.8; size by treatment: df = 3, F = 11.1, p < 0.001).
We also fit the data obtained from each monkey with the inverse function previously found to fit CON data, E = 1/(aR) [control data as previously reported by Minamimoto et al. (2009)]. This function fit the L-PFC data well without modification (r² > 0.9) (Fig. 4B), but not the O-PFC data (r² < 0.8). Although an inverse relation between error rate and predicted reward size remained, we observed reduced contrast between the minimum and maximum error rates in the O-PFC reward size curves (Fig. 4C). Specifically, O-PFC lesions reduced the difference between error rates on 1- versus 2-drop trials. To fit the O-PFC data, we introduced a second free parameter such that E = 1/(a(R + b)), where the new parameter b quantified the shift of each curve (from 1.3 to 6.9 drops). The parameter b thus quantifies the reduced behavioral sensitivity to changes in reward size. In all cases, O-PFC lesions reduced monkeys' sensitivity to differences in the predicted outcome, especially at low reward sizes.
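To illustrate how the shift parameter b can be estimated, the sketch below fits E = 1/(a(R + b)) by exploiting the fact that 1/E = aR + ab is linear in R. This linearization is an assumption made for illustration; the fitting procedure actually used is not described in this section, and a direct nonlinear fit would weight the data points differently. The function name and the synthetic data are hypothetical.

```python
def fit_inverse_model(reward_sizes, error_rates):
    """Fit E = 1 / (a * (R + b)) via ordinary least squares on 1/E = a*R + a*b.

    All error rates must be nonzero. Returns (a, b); b = 0 recovers E = 1/(aR).
    """
    ys = [1.0 / e for e in error_rates]
    n = len(reward_sizes)
    mean_r = sum(reward_sizes) / n
    mean_y = sum(ys) / n
    sxx = sum((r - mean_r) ** 2 for r in reward_sizes)
    sxy = sum((r - mean_r) * (y - mean_y) for r, y in zip(reward_sizes, ys))
    a = sxy / sxx                    # slope of 1/E against R
    b = (mean_y - a * mean_r) / a    # intercept divided by slope
    return a, b


if __name__ == "__main__":
    sizes = [1, 2, 4, 8]
    # Synthetic, noise-free error rates generated with a = 2 and b = 1.5.
    rates = [1.0 / (2.0 * (r + 1.5)) for r in sizes]
    print(fit_inverse_model(sizes, rates))
```

A larger b flattens the curve at small reward sizes, which is how the model expresses the reduced contrast between 1- and 2-drop trials.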
Delay to reward
Monkeys in the CON group increased their error rates linearly as a function of the predicted delay to reward (Fig. 5A) (Minamimoto et al., 2009). Like CON monkeys, L-PFC monkeys showed linearly increasing error rates with increases in predicted delay (Fig. 5B). Although error rates also increased with increasing delay in O-PFC monkeys, each monkey showed a different pattern of error rates across delays (Fig. 5C). Monkey O-PFC-BA's error rates increased linearly with delay, in a manner indistinguishable from controls. O-PFC-JE's error rates did not increase until the delay was >3.6 s. O-PFC-SU's error rates increased sharply between 0.3 and 3.6 s delays. These individual differences could not be accounted for by differences in the extent or location of the lesions.
A two-way ANOVA with between-subjects factor of treatment (CON, O-PFC, and L-PFC) and within-subjects factor of delay to reward (0.3 ± 0.1, 3.6 ± 0.6, or 7.2 ± 1.2 s) showed a main effect of delay, and an interaction between delay and treatment; there was no main effect of treatment alone (two-way ANOVA with repeated measures; delay: df = 2, F = 374.9, p < 0.001; treatment: df = 2, F = 0.64, p = 0.5; delay by treatment: df = 4, F = 2.77, p = 0.03). Pairwise comparisons between groups revealed that the interaction between delay and treatment arose from a difference between the CON and O-PFC groups only. Comparison of CON and L-PFC groups showed a main effect of delay, without effects of treatment alone or interaction between delay and treatment (two-way ANOVA with repeated measures; delay: df = 2, F = 276, p < 0.001; treatment: df = 1, F = 0.002, p = 0.96; delay by treatment: df = 2, F = 1.7, p = 0.19). Comparison of CON and O-PFC groups showed a main effect of delay and an interaction between delay and treatment, but no effect of treatment alone (two-way ANOVA with repeated measures; delay: df = 2, F = 309.4, p < 0.001; treatment: df = 1, F = 1.15, p = 0.31; delay by treatment: df = 2, F = 5, p < 0.01). Thus, O-PFC lesions also altered monkeys' ability to estimate predicted reward value as a function of delay to reward.
Reward size and delay
When both reward size and delay to reward were varied in a crossed design, error rates in the CON group depended on both variables in a systematic fashion (Fig. 6A). For each reward size, error rates in the CON group increased linearly with predicted delay. The rate of this increase depended on the predicted reward size: error rates increased more rapidly for smaller rewards (Minamimoto et al., 2009).
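The pattern described for the CON group (error rates linear in delay, with a steeper slope for smaller rewards) is what one would obtain if delay scaled the inverse reward-size relation multiplicatively. The sketch below assumes the form E = (1 + kD)/(aR); this form, the discounting parameter k, and the parameter values are illustrative assumptions and are not stated in this section.

```python
def predicted_error_rate(reward_size, delay_s, a, k):
    """Assumed joint model: E = (1 + k*D) / (a*R).

    At D = 0 this reduces to the inverse reward-size relation E = 1/(aR).
    """
    return (1.0 + k * delay_s) / (a * reward_size)


def delay_slope(reward_size, a, k):
    """Increase in error rate per second of delay under the assumed model."""
    return k / (a * reward_size)


if __name__ == "__main__":
    a, k = 5.0, 0.3  # hypothetical parameter values
    for size in (1, 4):
        curve = [round(predicted_error_rate(size, d, a, k), 3)
                 for d in (0.3, 3.6, 7.2, 10.8)]
        print(size, "drop(s):", curve, "slope:", delay_slope(size, a, k))
```

Under this form the slope with respect to delay is k/(aR), so quadrupling the reward size divides the slope by four.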
The performance of L-PFC monkeys had been indistinguishable from that of CONs when reward size and delay to reward were varied in separate sessions (Figs. 4, 5). However, L-PFC monkeys showed an abnormal error rate pattern in the crossed design (Fig. 6C). They now showed effects similar to those seen in the O-PFC monkeys: their error rates did increase with increasing delays, but the L-PFC monkeys were only able to distinguish between the predicted reward sizes when delays were longer than 3.6 s.
As expected from the results above, O-PFC monkeys displayed abnormal error rate patterns across the range of predicted reward sizes and delays (Fig. 6D). These monkeys failed to distinguish between the predicted reward sizes at delays up to 3.6 s. In this crossed design, O-PFC monkeys also showed an additional abnormal trend. Whereas normal monkeys scale value linearly with predicted delays (Fig. 5), error rates for the O-PFC monkeys appeared to asymptote at longer delays, especially for the larger reward size (4 drops) (Fig. 6).
A three-way ANOVA with between-subjects factor of treatment (CON, O-PFC, and L-PFC) and within-subjects factors of reward size and delay to reward showed main effects of size and delay, but no effect of treatment alone (three-way ANOVA with repeated measures; delay: df = 3, F = 282.7, p < 0.001; size: df = 1, F = 265.1, p < 0.001; treatment: df = 2, F = 1.3, p = 0.3). The behavioral changes seen in the L-PFC and O-PFC groups were captured by the interaction terms. Significant interactions were found between treatment and delay, treatment and reward size, and between delay and reward size (three-way ANOVA with repeated measures; treatment by delay: df = 6, F = 9.8, p < 0.001; treatment by size: df = 2, F = 4.5, p = 0.01; delay by reward: df = 3, F = 79.5, p < 0.001). The overall three-way interaction between treatment, delay, and reward size was also significant (three-way ANOVA with repeated measures; treatment by delay by reward size: df = 6, F = 2.7, p = 0.01).
Pairwise comparison between the CON and L-PFC groups showed main effects of delay and size, without an effect of treatment alone (three-way ANOVA with repeated measures; delay: df = 3, F = 342.4, p < 0.001; size: df = 1, F = 374.9, p < 0.001; treatment: df = 1, F = 3.9, p = 0.09). There were also two-way interactions between treatment and delay and delay and reward size, but not between treatment and reward size (three-way ANOVA with repeated measures; treatment by delay: df = 3, F = 24.9, p < 0.001; delay by reward: df = 3, F = 89.5, p < 0.001; treatment by size: df = 1, F = 0.3, p = 0.86). The three-way interaction between treatment, delay, and reward size was significant (three-way ANOVA with repeated measures; treatment by delay by reward size: df = 3, F = 7.3, p < 0.001).
Comparison between the CON and O-PFC groups showed main effects of delay and reward size, without an effect of treatment alone (three-way ANOVA with repeated measures; delay: df = 3, F = 127.2, p < 0.001; size: df = 1, F = 146.7, p < 0.001; treatment: df = 1, F = 1.6, p = 0.25). All three two-way interactions were also significant (three-way ANOVA with repeated measures; treatment by delay: df = 3, F = 2.6, p = 0.05; treatment by size: df = 1, F = 6.05, p = 0.01; delay by reward: df = 3, F = 40.3, p < 0.001). However, the three-way interaction between treatment, delay, and reward size was not significant (three-way ANOVA with repeated measures; treatment by delay by reward size: df = 3, F = 1.77, p = 0.15).
Satiation
In the CON group, error rates in the reward size task depended on the monkeys' satiation level (Fig. 7A) (Minamimoto et al., 2009). Specifically, error rates increased monotonically as each session progressed and as the monkeys moved from thirst to satiation. Moreover, the effect of satiation on error rate was consistent across reward sizes. We hypothesized that both O-PFC and L-PFC lesions would leave basic satiation processes intact. Based on reinforcer devaluation studies in which O-PFC lesions affect monkeys' choices after selective satiation, we predicted that O-PFC lesions might disrupt the interactions of satiation with the assessments of reward value (Gallagher et al., 1999; Pickens et al., 2003; Izquierdo et al., 2004; Machado and Bachevalier, 2007).
Error rates did increase with satiation in monkeys with L-PFC and O-PFC lesions. In addition to the already reported main effect of reward size, a three-way ANOVA with repeated measures revealed a main effect of session quartile (size: df = 3, F = 308.1, p < 0.001; quartile: df = 3, F = 347.7, p < 0.001). That is, error rates increased monotonically with satiation in all three groups. No main effect of treatment was observed (treatment: df = 2, F = 0.27, p = 0.77). Post hoc pairwise comparisons between the CON, O-PFC, and L-PFC groups also showed no main effect of either lesion on satiation, with main effects of reward size and quartile remaining as expected (three-way ANOVA with repeated measures; L-PFC treatment: df = 1, F = 0.26, p = 0.63; L-PFC size: df = 3, F = 339.4, p < 0.001; L-PFC quartile: df = 3, F = 315.6, p < 0.001; O-PFC treatment: df = 1, F = 0.38, p = 0.57; O-PFC size: df = 3, F = 154.7, p < 0.001; O-PFC quartile: df = 3, F = 186.2, p < 0.001).
As expected, how satiation affected behavior varied across the groups. All two- and three-way interactions were significant (three-way ANOVA with repeated measures; treatment by quartile: df = 6, F = 13.16, p < 0.001; treatment by size: df = 6, F = 9.36, p < 0.001; quartile by size: df = 9, F = 36.16, p < 0.001; treatment by quartile by size: df = 18, F = 2.83, p < 0.001). Post hoc pairwise comparisons between the CON, O-PFC, and L-PFC groups also revealed significant interactions between these factors (three-way ANOVA with repeated measures; L-PFC treatment by quartile: df = 3, F = 10.1, p < 0.001; L-PFC treatment by size: df = 3, F = 4.7, p = 0.003; L-PFC quartile by size: df = 9, F = 39.8, p < 0.001; L-PFC treatment by quartile by size: df = 9, F = 2.2, p = 0.02; O-PFC treatment by quartile: df = 3, F = 15.8, p < 0.001; O-PFC treatment by size: df = 3, F = 4.9, p = 0.002; O-PFC quartile by size: df = 9, F = 23.0, p < 0.001; O-PFC treatment by quartile by size: df = 9, F = 4.6, p < 0.001). These significant interactions indicate that each lesion type altered the shape of the satiation curve.
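The session quartiles used in these analyses were based on accumulated reward rather than on trial count. A minimal sketch of one way to make that assignment follows; the function name is hypothetical, and the convention of using the reward accumulated before each trial is an assumption, since the exact procedure is not given in this section.

```python
def quartile_by_accumulated_reward(rewards_ml):
    """Assign each trial a quartile (1-4) of the session's total reward.

    rewards_ml: reward delivered on each trial, in session order
    (0 for error trials). The quartile reflects the reward accumulated
    before the trial, i.e. the satiation state in which the cue was seen.
    """
    total = sum(rewards_ml)
    if total <= 0:
        raise ValueError("session delivered no reward")
    quartiles = []
    accumulated = 0.0
    for reward in rewards_ml:
        fraction = accumulated / total
        quartiles.append(min(int(fraction * 4) + 1, 4))
        accumulated += reward
    return quartiles


if __name__ == "__main__":
    # Large rewards early in the session move the monkey through the
    # first quartiles in fewer trials than small rewards would.
    print(quartile_by_accumulated_reward([8, 8, 1, 1, 2, 4, 4, 4]))
```

Error rates computed separately within each quartile and reward size would then give the satiation curves analyzed above.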
In the CON monkeys, predicted reward value could be mathematically modeled by simply multiplying a function of satiation by reward size: R′ = R·f(S).
Discussion
Bilateral ablations of O-PFC and L-PFC disrupted performance on a task in which visual cues predicted reward size and delay to reward. O-PFC lesions altered monkeys' estimations of predicted reward value when reward size or delay to reward were varied either individually or in combination. When reward size effects were taken into account, O-PFC lesions had no significant effect on satiation. In contrast, L-PFC lesions did not alter value estimations based on predicted reward size or delay alone. Nor did L-PFC lesions alter the effects of satiation. L-PFC lesions only altered the monkeys' estimations of predicted reward value when size and delay were varied together. These distinctions suggest fundamentally different roles for these two prefrontal cortical regions.
Orbital prefrontal cortex
A consistent body of literature has indicated that O-PFC is involved when visual cues are used to predict reinforcing outcomes. Previous O-PFC lesion studies have revealed deficits in stimulus–reinforcer reversals, stimulus–reinforcer updating based on food preferences, and the acquisition of new responses for conditioned reinforcement (Jones and Mishkin, 1972; Dias et al., 1996; Gallagher et al., 1999; Pears et al., 2003; Izquierdo et al., 2004; Machado and Bachevalier, 2007). O-PFC lesions also impair the ability of monkeys to learn stimulus–reinforcer contingencies when those contingencies vary stochastically (Rudebeck et al., 2008).
The present study combined selective bilateral ablations of O-PFC with a task design that allowed us to directly measure monkeys' relative value estimations, without requiring them to make choices. Our results extend previous findings by showing that O-PFC is required for behavioral adjustments as trial-to-trial changes in predicted reward size occur. Moreover, although not unexpected, the delay-to-reward results demonstrate that monkeys require an intact O-PFC for normal temporal discounting. Our results complement findings from physiological studies in which neuronal activity in O-PFC was found to reflect reward value as a function of reward size, preference, and effort (Tremblay and Schultz, 1999; Hikosaka and Watanabe, 2000; Wallis and Miller, 2003; Roesch and Olson, 2004; Padoa-Schioppa and Assad, 2006; Kennerley et al., 2009; Bouret and Richmond, 2010). Together, the data suggest that O-PFC is necessary for the estimation of predicted reward value when any individual reward dimension is varied.
O-PFC ablations did not completely eliminate monkeys' abilities to use cues to predict reward size, to temporally discount rewards, or to do both. Rather, the lesions changed how monkeys scaled relative differences in predicted reward value. Specifically, O-PFC lesions blunted monkeys' appreciation of the range of predicted values. Recent electrophysiological studies have shown that many orbitofrontal neurons adapt to the range of available reward sizes. That is, the neuronal responses in O-PFC rescale to reflect the contrast between the largest (or most preferred) reward and the smallest (or least preferred) available in the task (Padoa-Schioppa, 2009; Kobayashi et al., 2010). Together with our data, these findings suggest that an intact O-PFC is not necessary for recognizing that different rewards have different intrinsic values; rather, O-PFC is essential for using environmental cues to scale and rescale reward value.
The view that O-PFC is required for constructing scales by which to compare the relative values of predicted rewards may also explain findings from studies using choice paradigms. In reinforcer devaluation experiments, selective satiation abruptly alters monkeys' food preferences. Control monkeys take this change into account and shift their choices away from objects predicting devalued foods. Bilateral O-PFC lesions significantly attenuate this shift in object choices (Izquierdo et al., 2004; Machado and Bachevalier, 2007). This deficit has been interpreted as an inability to update value representations stored in O-PFC (Izquierdo et al., 2004; Murray and Wise, 2010); it might equally be considered an inability to rescale value. An alternative possibility is that the reinforcer devaluation task and the visually cued reward task measure different aspects of reward value (Parkinson et al., 2005).
Scaling deficits might be more marked when the contrast provided by the range of available rewards is small. The present study provides evidence supporting this idea: monkeys with O-PFC lesions seemed to have the greatest problems with small, closely spaced reward sizes and shorter delays (Figs. 4, 5). In another recent experiment, when reward probabilities were fixed, monkeys with O-PFC lesions had much greater difficulty differentiating among probabilities with the narrowest range of relative values (Walton et al., 2010). Our findings leave open the possibility that, under conditions in which predicted reward probabilities change from trial to trial, O-PFC facilitates the formation of precise associations between particular visual stimuli and particular outcomes (Walton et al., 2010).
Lateral prefrontal cortex
When predicted reward size or delay to reward was tested separately, the ability of monkeys with L-PFC lesions to estimate reward value appeared unaffected. However, to our surprise, these monkeys were significantly impaired when size and delay were varied together in a fully crossed design.
One explanation might be that bilateral L-PFC lesions limit the monkeys' overall cognitive capacity, allowing them to estimate values predicted by three or four different cues, but not by eight. However, these same monkeys have successfully performed a categorization task involving >20 cues. In that study, the pattern of error rates after L-PFC ablation was unchanged from prelesion baseline testing (Minamimoto et al., 2010). Therefore, task complexity or cognitive load seems unlikely to account for the deficits we report.
Our results might initially appear at odds with electrophysiological results showing a smaller proportion of neurons in L-PFC than O-PFC encoding value across multiple dimensions (Kennerley and Wallis, 2009b; Kennerley et al., 2009). Among other experimental differences, however, Kennerley et al. examined the main effects of each reward dimension independently. Our study tested the effects of L-PFC lesions on value estimations when the two reward dimensions varied together on each trial. Deficits emerged specifically when monkeys were required to combine information about predicted reward value across size and time.
L-PFC has long been associated with the ability to hold information on-line, to perform extradimensional shifts, and/or to integrate attributes across domains (Goldman-Rakic, 1995; Dias et al., 1996; Owen et al., 1998; Fuster, 2001; Miller and Cohen, 2001; Curtis and D'Esposito, 2004). Therefore, we suggest that the deficits observed here reflect the critical role of L-PFC in integrating reward value information across multiple dimensions.
Conclusions
These results fit with a body of work distinguishing the roles of O-PFC and L-PFC. O-PFC is known to play a critical role in estimating predicted reward values; L-PFC may use reward value information to guide decision making (Wallis and Kennerley, 2010). We suggest that O-PFC is essential specifically for establishing the context-specific scales used to measure the relative values of predicted outcomes. L-PFC appears necessary for integration of predicted reward values across different scales.
Footnotes
This work was supported by the Division of Intramural Research Programs at the National Institute of Mental Health. We thank Richard Saunders, Megan Malloy, Ping-Yu Chen, Dawn Lundgren-Anuszkiewicz, James Peck, and Sebastien Bouret.
Correspondence should be addressed to Barry J. Richmond, Laboratory of Neuropsychology, National Institute of Mental Health, National Institutes of Health, Department of Health and Human Services, Building 49, Room 1B80, Bethesda, MD 20892-4415. E-mail: bjr@ln.nimh.nih.gov