Abstract
Although the involvement of the striatum in the refinement and control of movement has long been recognized, the recent description of discrete frontal corticobasal ganglia networks in a range of species has focused attention particularly on the role of the dorsal striatum in executive functions. Current evidence suggests that the dorsal striatum contributes directly to decision-making, especially to action selection and initiation, through the integration of sensorimotor, cognitive, and motivational/emotional information within specific corticostriatal circuits involving discrete regions of striatum. We review key evidence from recent studies in rodent, nonhuman primate, and human subjects.
- choice
- utility
- frontal cortex
- executive
- reward
- striatum
To choose appropriately between distinct courses of action requires the ability to integrate an estimate of the causal relationship between an action and its consequences, or outcome, with the value, or utility, of that outcome. Cognition alone cannot fully determine action selection, because any piece of information, such as “action A leads to outcome O,” can be used both to perform A and to avoid performing A. It is interesting to note in this context that, although there is an extensive literature linking the cognitive control of executive functions specifically to the prefrontal cortex (Goldman-Rakic, 1995; Fuster, 2000), more recent studies suggest that these functions depend on reward-related circuitry linking prefrontal, premotor, and sensorimotor cortices with the striatum (Chang et al., 2002; Lauwereyns et al., 2002; Tanaka et al., 2006).
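This integration can be stated compactly. In the expression below, offered purely as a schematic illustration of the argument rather than as a formula from the work cited here, $P(O \mid A)$ denotes the estimated causal relationship between action $A$ and outcome $O$, and $U(O)$ denotes the current utility of that outcome:

$$V(A) = \sum_{O} P(O \mid A)\, U(O), \qquad A^{*} = \arg\max_{A} V(A).$$

On this formulation, the purely cognitive term $P(O \mid A)$ cannot by itself determine choice: the same contingency favors performing $A$ when $U(O)$ is positive and withholding $A$ when it is negative, so action selection is fixed only once causal knowledge and utility are combined.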
Importantly, evidence from a range of species suggests that this corticostriatal network controls functionally heterogeneous decision processes involving (1) actions that are more flexible or goal directed, sensitive to rewarding feedback, and mediated by discrete regions of association cortex, particularly medial, orbitomedial, premotor, and anterior cingulate cortices, together with their targets in the caudate/dorsomedial striatum (Haruno and Kawato, 2006; Levy and Dubois, 2006); and (2) actions that are stimulus bound, relatively automatic or habitual, and mediated by sensorimotor cortices and the dorsolateral striatum/putamen (Jog et al., 1999; Poldrack et al., 2001). These processes have been argued to depend on distinct learning rules (Dickinson, 1994) and, correspondingly, distinct forms of plasticity (Partridge et al., 2000; Smith et al., 2001). Furthermore, degeneration in these corticostriatal circuits has been linked to distinct forms of psychopathology, e.g., to Huntington's disease, obsessive-compulsive disorder, and Tourette's syndrome on the one hand (Robinson et al., 1995; Bloch et al., 2005; Hodges et al., 2006) and to Parkinson's disease and multiple system atrophy on the other (Antonini et al., 2001; Seppi et al., 2006).
Here, we review recent evidence implicating the dorsal striatum in decision-making and point to the considerable commonalities in the functionality of this region in rodent, nonhuman primate, and human subjects.
Instrumental conditioning in rats
Behavioral research over the last two decades has identified forms of learning in rodents homologous to goal-directed and habitual learning in humans. This suggestion is based on extensive evidence that choice between different actions, e.g., pressing a lever or pulling a chain when these actions earn different food rewards, is determined by the animals' encoding of the association between a specific action and its outcome, together with the current value of that outcome; choice is sensitive both to degradation of the action–outcome contingency and to outcome revaluation treatments (Dickinson and Balleine, 1994; Balleine and Dickinson, 1998). In contrast, when actions are overtrained, decision processes become more rigid or habitual; performance is no longer sensitive to degradation and devaluation treatments but rather is controlled by a process of sensorimotor association (Dickinson, 1994; Dayan and Balleine, 2002). As such, whereas action–outcome encoding appears to be mediated by a form of error-correction learning rule, the development of habits is not (Dickinson, 1994); indeed, this form of learning has traditionally been argued to be sensitive to contiguity rather than contingency (Dickinson et al., 1995).
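The contrast between the two learning rules can be sketched as follows (a minimal illustration with arbitrary parameter values, not a model drawn from the cited studies): an error-correction rule updates the action–outcome association only in proportion to how unexpected the outcome is, whereas a contiguity-based rule strengthens a stimulus–response association whenever response and reinforcer simply co-occur.

```python
# Schematic contrast between the two learning rules discussed above.
# Learning rates and representations are illustrative assumptions only.

def goal_directed_update(v_ao, outcome_value, alpha=0.1):
    """Error-correction (action-outcome) rule: the update is driven by the
    discrepancy between the predicted and the obtained outcome, so learning
    stops once the outcome is fully predicted."""
    return v_ao + alpha * (outcome_value - v_ao)

def habit_update(sr_strength, response_occurred, reinforcer_occurred, alpha=0.1):
    """Contiguity-based (stimulus-response) rule: strength accrues whenever
    response and reinforcer co-occur, regardless of prediction, and so keeps
    growing with continued training."""
    if response_occurred and reinforcer_occurred:
        sr_strength += alpha
    return sr_strength
```

Because the first rule asymptotes once the outcome is predicted whereas the second continues to accumulate with training, a scheme of this kind is consistent with the observation that extended training shifts control from the goal-directed to the habitual process.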
Recent experiments have started to reveal differences in the circuitry associated with these distinct forms of decision process in rodents (see Fig. 1). Cell body lesions of prefrontal cortex, particularly of the dorsal prelimbic (Balleine and Dickinson, 1998; Corbit and Balleine, 2003), but not the infralimbic (Killcross and Coutureau, 2003), region abolish the acquisition of goal-directed actions, and performance is acquired by sensorimotor association alone. Prelimbic involvement in goal-directed learning is, however, limited to acquisition (Ostlund and Balleine, 2005; Hernandez et al., 2006), suggesting that this cortical region supports learning-related plasticity localized to an efferent structure. Prelimbic cortex projects densely to both the dorsomedial striatum and the accumbens core (Gabbott et al., 2005) and, although well-controlled studies specifically assessing goal-directed learning suggest that the latter region is not involved in this learning process (Corbit et al., 2001), recent evidence has implicated the dorsomedial striatum (Balleine, 2005). Thus, pretraining and posttraining lesions (Yin et al., 2005c), muscimol-induced inactivation (Yin et al., 2005c), and infusion of the NMDA antagonist AP5 (Yin et al., 2005b) within a posterior region of dorsomedial striatum all abolish goal-directed learning and render choice performance insensitive to both contingency degradation and outcome devaluation treatments, i.e., choice becomes rigid and habitual (Yin et al., 2005a).
Interestingly, evidence suggests that a parallel corticostriatal circuit involving infralimbic and sensorimotor cortices together with the dorsolateral striatum in rodents may mediate the transition to habitual decision processes associated with sensorimotor learning (Jog et al., 1999; Killcross and Coutureau, 2003; Barnes et al., 2005). Whereas the infralimbic region has been argued to mediate aspects of the reinforcement signal controlling sensorimotor association (Balleine and Killcross, 2006), changes in motor cortex and dorsolateral striatum appear to be training related (Costa et al., 2004; Hernandez et al., 2006; Tang et al., 2007) and to be coupled to changes in plasticity as behavioral processes become less flexible (Costa et al., 2006; Tang et al., 2007). Correspondingly, whereas overtraining causes performance to become insensitive to outcome devaluation, lesions of the dorsolateral striatum reverse this effect, rendering performance goal directed and once again sensitive to outcome devaluation treatments (Yin et al., 2004). Likewise, muscimol inactivation of the dorsolateral striatum renders otherwise habitual performance sensitive to changes in the action–outcome contingency (Yin et al., 2005a). Current evidence suggests, therefore, that, whereas stimulus-mediated action selection is mediated by this lateral corticostriatal circuit, the more flexible, outcome-mediated action selection subserving goal-directed action is mediated by a more medial corticostriatal circuit, consistent with the general claim that distinct corticostriatal networks control different forms of decision process (Daw et al., 2005).
Striatal-based learning processes in nonhuman primates
Similarly, recent studies using nonhuman primates have suggested that the striatum may be an important brain area for decision-making. A clue to this hypothesis came from single-unit recording studies using trained animals. Neurons that respond to task-related sensory events, that become active before task-related motor behaviors, and that remain tonically active until expected rewards are delivered have been described in a circumscribed region of the dorsal striatum (Hikosaka et al., 1989; Hollerman et al., 1998). Importantly, the activity of these neurons has been found to be modulated by the expected presence, amount, or probability of reward or by the degree of attention or memory required to execute the task (Kawagoe et al., 1998; Shidara et al., 1998; Cromwell and Schultz, 2003). The coexpression of sensorimotor, cognitive, and motivational/emotional signals in single neurons provides conditions favorable for learning. Several recent studies have addressed this question more directly by examining neuronal activity while animals learned new sensorimotor associations or new motor sequences or adapted to new reward outcomes. They found that, indeed, many striatal neurons decreased or increased their activity as learning progressed (Tremblay et al., 1998; Blazquez et al., 2002; Miyachi et al., 2002; Hadj-Bouziane and Boussaoud, 2003; Brasted and Wise, 2004). Notably, during the course of learning, visuomotor (saccadic) activity appropriate for the correct response appeared earlier in the associative striatum (caudate nucleus) than in the dorsolateral prefrontal cortex (Pasupathy and Miller, 2005).
The results of single-unit recordings have been supported further by experiments that manipulated learning-related neuronal activity. One established method was to suppress the activity of neurons in a small functional area of the striatum by injecting GABA agonists such as muscimol. Consistent with the dissociations observed in rodents, suppression of the anterior, associative striatum disrupted the learning of new sequential motor procedures, whereas suppression of the putamen disrupted the execution of well-learned motor sequences (Miyachi et al., 1997). A second method has sought to promote learning by electrically stimulating the striatum (Nakamura and Hikosaka, 2006b; Williams and Eskandar, 2006). Importantly, this was effective only when the stimulation was applied just after the animal executed a motor response correctly. This timing specificity raises the possibility that sensorimotor, cognitive, and motivational/emotional signals reach single striatal neurons concurrently, around the time of motor execution, to cause plastic changes in synaptic mechanisms.
These experiments on learning have been conducted mostly in the dorsal striatum, where motivational/emotional signals from cortex are thought to be sparse. Instead, it appears more likely that motivational signals are supplied by dopaminergic inputs originating mainly from the substantia nigra pars compacta. It has been hypothesized that dopaminergic neurons encode a mismatch between the expected and the actual reward value (Schultz, 1998). This so-called reward prediction error signal is well suited to guide learning until reward is maximized (Houk et al., 1995).
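In its standard temporal-difference form, given here as a textbook formulation rather than as a result of the studies cited above, the prediction error at time $t$ is

$$\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t),$$

where $r_t$ is the reward received, $V(s)$ is the learned value of state $s$, and $\gamma$ is a discount factor. The signal is positive when reward is better than expected, negative when it is worse, and approaches zero as rewards become fully predicted, so it can drive learning only for as long as there is something left to learn.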
Indeed, studies of synaptic mechanisms in the striatum have shown that long-term potentiation (LTP) or long-term depression (LTD) can occur at corticostriatal synapses depending on the combination of cortical inputs, striatal outputs, and D1 and D2 dopaminergic inputs (Reynolds and Wickens, 2002). A study using behaving animals showed that these mechanisms are necessary for the adaptation of motor behavior to changing reward–position contingencies (Nakamura and Hikosaka, 2006a): local injections of a D1 antagonist into the caudate nucleus lengthened motor (saccadic) reaction times when large rewards were expected, whereas injections of a D2 antagonist lengthened reaction times when small rewards were expected.
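One way to schematize how such dopamine-dependent plasticity could implement learning is a three-factor rule, in which the change in a corticostriatal weight depends jointly on presynaptic cortical activity, postsynaptic striatal activity, and the dopaminergic prediction-error signal. The sketch below is illustrative only; it does not distinguish D1- from D2-expressing neurons or the direct and indirect pathways.

```python
def corticostriatal_update(weight, cortical_pre, striatal_post,
                           dopamine_error, lr=0.01):
    """Illustrative three-factor rule: coincident cortical (presynaptic) and
    striatal (postsynaptic) activity is strengthened when dopamine signals a
    positive prediction error (LTP-like) and weakened when the error is
    negative (LTD-like)."""
    return weight + lr * cortical_pre * striatal_post * dopamine_error
```

Under a rule of this kind, the same cortical input can be potentiated or depressed depending on the sign of the dopamine signal, mirroring the dependence of corticostriatal LTP and LTD on dopaminergic input described above.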
Action–contingency learning in the human dorsal striatum
Studies in humans corroborate the research in animals suggesting that the dorsal striatum is an integral part of a circuit involved in decision-making. Accumulating evidence, primarily from neuroimaging but also from neuropsychological investigations, has implicated the dorsal striatum in different aspects of the motivational and learning processes that support goal-directed action. For instance, positron emission tomography (PET) studies report increases in dopamine release in the dorsal striatum (as measured by the displacement of radioligand binding by endogenous dopamine) when participants are presented with potential rewards, such as the opportunity to gain money (Koepp et al., 1998; Zald et al., 2004), or even when presented with food stimuli while in a state of hunger (Volkow et al., 2002). Similarly, functional magnetic resonance imaging (fMRI) studies typically report increases in blood oxygenation level-dependent (BOLD) responses in the dorsal striatum, much as in the ventral striatum, during anticipation of either primary (O'Doherty et al., 2002) or secondary (Knutson et al., 2001) rewards.
What distinguishes the human dorsal striatum from the rest of the basal ganglia is its involvement in action-contingent learning (Delgado et al., 2000, 2005a; Knutson et al., 2001; Haruno et al., 2004; O'Doherty, 2004; Tricomi et al., 2004). Similar to the animal literature (Ito et al., 2002; Yin et al., 2005c), learning about actions and their reward consequences involves the dorsal striatum (O'Doherty et al., 2004; Tricomi et al., 2004), as opposed to more passive forms of appetitive learning, which have been found to depend on the ventral striatum (O'Doherty, 2004). These results mirror neuropsychological studies of patients with Parkinson's disease, who are impaired in learning about probabilistic stimuli when action contingencies are present (Poldrack et al., 2001) but are unimpaired when no contingency between action and outcome exists (Shohamy et al., 2004). Within the human dorsal striatum, learning of action–reward associations has been found in both the putamen and the caudate nucleus, with potentially different roles based on their sensorimotor and associative connectivity, respectively (Alexander and Crutcher, 1990). Some studies argue, for example, that the putamen is important for stimulus–action coding (Haruno and Kawato, 2006). In contrast, a number of studies suggest that the head of the caudate nucleus is involved in coding reward-prediction errors during goal-directed behavior (Davidson et al., 2004; O'Doherty et al., 2004; Delgado et al., 2005a; Haruno and Kawato, 2006).
More recently, neuroimaging studies have extended these general ideas on the function of the human dorsal striatum to more complex social issues. Increases in BOLD responses in the dorsal striatum, for example, have been reported when an interactive social component exists, such as the occurrence of cooperation (Rilling et al., 2002) or revenge (de Quervain et al., 2004). The caudate nucleus has also been implicated in the acquisition of social reputations (via reciprocity in an economic exchange game, the “trust game”) through trial and error (King-Casas et al., 2005). However, existing social biases (e.g., knowledge about moral characteristics) can also hinder corticostriatal learning mechanisms and influence subsequent decisions (Delgado et al., 2005b). A future challenge for researchers is, therefore, to understand the role of the dorsal striatum in goal-directed behaviors with respect to the vast array of existing social complexities.
Summary and future directions
Together, this recent evidence suggests that the dorsal striatum mediates important aspects of decision-making, particularly those related to encoding specific action–outcome associations in goal-directed action and to the selection of actions on the basis of their currently expected reward value. We have summarized these findings and the major trends in research that they imply in Figure 1. These findings are consistent with computational theories of adaptive behavior, notably forms of reinforcement learning, and, when considered in the context of current views of the broader corticobasal ganglia system, appear likely to provide the basis for an integrated approach to striatal function.
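To make the link to reinforcement learning concrete, the following sketch implements a generic action-value update and a selection rule that chooses among actions on the basis of their currently expected reward value. It is a standard textbook algorithm offered purely as an illustration of this computational framing, not a model of the striatal circuitry reviewed above; the action names and reward probabilities are arbitrary.

```python
import random

def update_action_value(q, action, reward, alpha=0.1):
    """Move the expected value of the chosen action toward the obtained reward."""
    q[action] += alpha * (reward - q[action])

def select_action(q, epsilon=0.1):
    """Choose the currently most valuable action, with occasional exploration."""
    if random.random() < epsilon:
        return random.choice(list(q))
    return max(q, key=q.get)

# Illustrative use: two instrumental actions with different reward probabilities.
q_values = {"press_lever": 0.0, "pull_chain": 0.0}
reward_prob = {"press_lever": 0.8, "pull_chain": 0.2}
for _ in range(200):
    action = select_action(q_values)
    reward = 1.0 if random.random() < reward_prob[action] else 0.0
    update_action_value(q_values, action, reward)
```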
Important questions remain: how far do the anatomical and functional similarities across species extend, particularly with respect to the cognitive control of actions? Is the striatum the only site at which goal-directed learning occurs (Hikosaka et al., 2006), and is dopamine the sole teacher of this learning (Aosaki et al., 1994; Seymour et al., 2005)? Is the striatum involved only in reward-based learning, or does it also contribute to the inhibition of responses associated with aversive consequences (Hikosaka et al., 2006)? Finally, is the product of learning (e.g., the motor memory) stored in the striatum? The differential effects of muscimol injections suggest one of two things: either the associative striatum guides acquisition and the motor memory is stored in the putamen (perhaps in addition to other motor areas), or these regions are involved in quite distinct motor processes, with the associative striatum encoding a more abstract relationship between actions and their consequences and the putamen, which is connected with motor cortical areas, encoding actions in the muscle–joint domain that, after extensive practice, become resistant to changes in outcome, the hallmark of habitual behavior.
Footnotes
- Received April 6, 2007.
- Revision received June 4, 2007.
- Accepted June 4, 2007.
- This work was supported by National Institute of Mental Health Grant 56446 (B.W.B.) and by the National Eye Institute Intramural Research Program (O.H.).
- Correspondence should be addressed to Bernard W. Balleine, Department of Psychology, Box 951563, Los Angeles, CA 90095-1563. balleine@psych.ucla.edu
- Copyright © 2007 Society for Neuroscience 0270-6474/07/278161-05$15.00/0