Every day we evaluate different courses of action, based on the rewards and costs associated with each one, and then choose between them. The neurobiological basis of this powerful and flexible behavior is poorly understood.
Theoretical analyses of decision-making highlight the need for the values of each alternative action to be mapped onto a single, common “utility” dimension (McFarland and Sibly, 1975). Were it not for this transformation, we would only be able to compare like with like. By mapping the value of disparate options onto a single, common dimension we are able to compare apples and oranges, chalk and cheese. Once this action evaluation has taken place, action selection is a relatively simple process of choosing, or biasing choice toward, the highest utility option. Implicit in most of these theoretical works is the idea that there is one central decision-making region in the brain.
Recently, however, this assumption has been directly challenged. In an intriguing study, Rudebeck et al. (2006) demonstrated that the frontal lobe of the rat contains two distinct decision-making areas: one specialized for intertemporal choices, the other for effort-related choices. Orbitofrontal cortex (OFC) lesions affected how long rats would wait but not how hard they would work for reward. In contrast, anterior cingulate cortex (ACC) lesions affected how hard rats would work but not how long they would wait for reward. This demonstration of two specialized decision-making regions in the frontal lobes of the rat raises the question of whether there are specialized decision-making regions in the human frontal lobe that can be anatomically dissociated.
A recent study by Blair et al. (2006) in The Journal of Neuroscience addresses this question. In particular, the authors were interested in investigating whether the ventromedial prefrontal cortex (vmPFC) and the dorsal anterior cingulate cortex (dACC) make differentiable contributions to two aspects of decision-making: decision form and between-object reinforcement distance. Decision form is whether the decision is between only positive utility options (good vs better) or between only negative utility options (bad vs worse). Between-object reinforcement distance is the absolute difference between the utilities associated with each of the options. Thus, a choice between two options with very similar utilities would have a low between-object reinforcement distance, whereas a choice between two options with very dissimilar utilities would have a high between-object reinforcement distance.
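The two trial variables can be made concrete with a minimal sketch (this is illustrative only, not the authors' analysis code; the point values are hypothetical):

```python
def decision_form(value_a: float, value_b: float) -> str:
    """Label a trial by the sign of the two options' utilities."""
    if value_a >= 0 and value_b >= 0:
        return "reward-reward"
    if value_a < 0 and value_b < 0:
        return "punishment-punishment"
    return "punishment-reward"

def reinforcement_distance(value_a: float, value_b: float) -> float:
    """Between-object reinforcement distance: the absolute
    difference between the two options' utilities."""
    return abs(value_a - value_b)

# Hypothetical trial: both options lose points (a bad-vs-worse choice).
print(decision_form(-40, -10))           # punishment-punishment
print(reinforcement_distance(-40, -10))  # 30.0 -> a "far" distance
```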
Twenty-one subjects underwent functional magnetic resonance imaging (fMRI) while performing a decision-making task in which they had to win as many points as possible. On each trial, subjects had to choose one of two visually presented stimuli. Ten different stimuli were used, each with a preassigned point value, and feedback on the amount gained or lost was presented on the screen after each decision. The authors employed three decision-form conditions (punishment–punishment, punishment–reward, reward–reward) and three between-object reinforcement distances (close, medium, far). For example, punishment–punishment trials involved choosing between two options, both leading to point loss. The option with the highest expected value was designated the correct choice.
Increasing between-object reinforcement distance improved both the speed and accuracy of subjects' choices [Blair et al. (2006), their Fig. 1b (http://www.jneurosci.org/cgi/content/full/26/44/11379/F1)]. Similarly, decision form had a large effect on both the speed and accuracy of subjects' choices, with punishment–punishment trials slower and less accurate than punishment–reward trials, and punishment–reward trials slower than reward–reward trials [Blair et al. (2006), their Fig. 1b (http://www.jneurosci.org/cgi/content/full/26/44/11379/F1)].
The fMRI data of Blair et al. revealed regional differences related to both decision form and between-object reinforcement distance. Decision form was represented in a variety of regions [Blair et al. (2006), their Table 1 (http://www.jneurosci.org/cgi/content/full/26/44/11379/T1)], including the right vmPFC and right dACC. Activity in the vmPFC showed a graded response across decision-form types: highest for reward–reward trials, intermediate for punishment–reward trials, and lowest for punishment–punishment trials. In contrast, neural activity in the dACC was highest for punishment–punishment trials.
The between-object reinforcement distance was represented in the dACC and the left frontal gyrus. Activity was highest in the dACC when the between-object reinforcement distance was small.
In light of the finding that the vmPFC may encode both the valence of behavioral outcome and the subsequent choice in a decision-making task (O'Doherty et al., 2003), the authors conducted a second analysis of the vmPFC data. The total reinforcement available per trial was calculated by summing the values of the two options presented. Neural activity in the vmPFC showed a highly significant linear relationship with the total reinforcement available in the task, with vmPFC activity increasing as the total reinforcement available increased [Blair et al. (2006), their Fig. 4a (http://www.jneurosci.org/cgi/content/full/26/44/11379/F4)]. Neural activity in the vmPFC was also correlated with the value of the option chosen [Blair et al. (2006), their Fig. 4b (http://www.jneurosci.org/cgi/content/full/26/44/11379/F4)], although this correlation was significantly weaker. The discovery that neural activity in the vmPFC encodes the value of both available options in a decision-making task points to the vmPFC playing a role in action evaluation, and perhaps represents a site at which each action is compared on a single, common dimension (Shizgal and Conover, 1996).
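The two per-trial regressors examined in this second analysis reduce to simple arithmetic; a minimal sketch (illustrative only, with hypothetical point values, not the authors' code):

```python
def total_reinforcement_available(value_a: float, value_b: float) -> float:
    """Sum of the values of the two options presented on a trial."""
    return value_a + value_b

def chosen_value(value_a: float, value_b: float, chose_a: bool) -> float:
    """Value of the option the subject actually selected."""
    return value_a if chose_a else value_b

# Hypothetical trial: options worth +30 and +10 points; subject picks A.
print(total_reinforcement_available(30, 10))  # 40
print(chosen_value(30, 10, chose_a=True))     # 30
```

The key contrast is that vmPFC activity tracked the first quantity (the value of both options on offer) more strongly than the second (the value of the chosen option alone).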
What, then, of the dACC? This study demonstrated that activity in the dACC was highest in the punishment–punishment condition and when between-object reinforcement distance was small. One intriguing possibility that may explain both of these observations is that the dACC is involved in signaling when errors are likely (Brown and Braver, 2005). Brown and Braver suggest that the ACC response is proportional to the perceived likelihood of an error in a given condition. From the behavioral data, it is clear that subjects made the most errors in punishment–punishment trials and when the between-object reinforcement distance was small [Blair et al. (2006), their Fig. 1a (http://www.jneurosci.org/cgi/content/full/26/44/11379/F1)]. Thus, as predicted by Brown and Braver's (2005) model of ACC function, dACC neural activity is highest in the conditions in which errors are most likely.
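The core of the Brown–Braver account can be sketched in a few lines: modeled ACC response is simply proportional to the perceived error probability in each condition. The error rates and gain below are hypothetical placeholders, not values from either paper:

```python
# Hypothetical per-condition error rates (illustrative numbers only),
# ordered as in the behavioral data: punishment-punishment trials were
# the most error-prone, reward-reward the least.
error_rate = {
    "punishment-punishment": 0.30,
    "punishment-reward": 0.15,
    "reward-reward": 0.10,
}

GAIN = 1.0  # arbitrary proportionality constant

# Brown-Braver-style prediction: ACC response proportional to the
# perceived likelihood of an error in that condition.
predicted_acc = {cond: GAIN * p for cond, p in error_rate.items()}

# The predicted ACC response is highest where errors are most likely:
print(max(predicted_acc, key=predicted_acc.get))  # punishment-punishment
```

Under this toy parameterization, the model reproduces the qualitative pattern Blair et al. observed in the dACC.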
This study is an important step in unraveling the complementary but differentiable contributions of regions of the frontal lobe to decision-making. It provides evidence that the vmPFC is important in signaling the valence of all the available options, not just the chosen option, and that the dACC is involved in signaling trials on which errors are likely to occur. This study highlights the need to understand decision-making not as a unitary process underpinned by a solitary brain region, but as a functional network of complementary yet anatomically dissociable brain regions.
Editor's Note: These short reviews of a recent paper in the Journal, written exclusively by graduate students or postdoctoral fellows, are intended to mimic the journal clubs that exist in your own departments or institutions. For more information on the format and purpose of the Journal Club, please see http://www.jneurosci.org/misc/ifa_features.shtml.
Correspondence should be addressed to Thomas Campbell, Department of Experimental Psychology, University of Oxford, Oxford OX1 3UD, UK.