Dopaminergic Mechanisms in Actions and Habits

Recent studies suggest new ways to interpret dopaminergic actions in goal-directed performance and habitual responding. In the early stages of learning, dopamine plays an essential role, but with extended training dopamine appears to play a decreasing role in response expression. Experimental manipulation of dopamine levels alters the correlation of cortical and striatal neural activity in behaving animals, and these dopamine-dependent changes in corticostriatal correlations may be reflected in changes in action selection in the basal ganglia. Consistent with this hypothesis, changes in dopamine signaling brought about by sensitization with amphetamine mimic the transition from goal-directed to habit-based instrumental performance. At the cellular level, dopamine-dependent synaptic plasticity may be important initially, and subsequently lead to more persistent changes that no longer require dopamine. The locus of these actions within the cortical and corticostriatal circuitry is a focus of ongoing research.


Dopamine-dependent synaptic plasticity in the corticostriatal pathway
Dopaminergic modulation of synaptic plasticity in the corticostriatal pathway was suggested as a possible mechanism for learning-related effects of dopamine in the neostriatum some decades ago (Beninger, 1983; Miller, 1988; Wickens and Kotter, 1995) and demonstrated experimentally in brain slices (Wickens et al., 1996). Experimental study of dopamine-dependent synaptic plasticity in the striatum was recently reviewed (Reynolds and Wickens, 2002). Striatal long-term potentiation (LTP) is a dopamine-dependent phenomenon that requires dopamine D1 receptor activation (Kerr and Wickens, 2001; Centonze et al., 1999). Using in vivo intracellular recording, Reynolds et al. (2001) showed that LTP can be induced by stimulation of the substantia nigra dopamine cells, using protocols that supported self-stimulation behavior in the same animals. This LTP was blocked in control animals administered a dopamine D1-like receptor antagonist. These findings suggest that stimulation of the substantia nigra may positively reinforce behavior by dopamine D1 receptor-dependent potentiation of cortical inputs to the striatum. Consistent with this, recent work has shown dopamine D1 receptor-dependent changes in neural firing patterns in the shell of the nucleus accumbens, related to anticipatory rises in dopamine concentration (Cheer et al., 2007).

Dopamine can cause changes in the coordinated activity of neuronal ensembles in corticostriatal circuits
Dopamine-dependent changes at corticostriatal synapses may have an impact on the temporal coordination of activity in corticobasal ganglia circuits. At the network level, changes in the levels of dopaminergic transmission result in rapid and profound changes in the coordinated activity of neurons in corticostriatal ensembles (Costa et al., 2006). When dopaminergic transmission is high, few neurons in the cortex and striatum display cross-correlated activity. On the other hand, during states of low dopaminergic transmission, as occur after dopamine depletion, the cross-correlated activity in corticostriatal circuits increases (Raz et al., 2001; Goldberg et al., 2002, 2004; Costa et al., 2006). Consistent with this, low dopaminergic transmission results in a large proportion of cortical and striatal neurons firing preferentially during a particular phase of the local field potential oscillation, whereas during high dopaminergic transmission few neurons fire entrained to the local field potential oscillation (Costa et al., 2006).
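One common way to quantify how strongly a neuron's firing is entrained to an oscillation is the vector strength (mean resultant length) of the phases at which its spikes occur. The sketch below is illustrative only: the function name `vector_strength` and the example phase lists are assumptions made here, and the text does not state that this particular measure was the one used in the studies cited above.

```python
import math


def vector_strength(spike_phases):
    """Mean resultant length of spike phases given in radians.

    Returns a value near 0 when spikes show no phase preference and
    a value near 1 when all spikes occur at the same phase.
    Hypothetical helper for illustration; not taken from the cited studies.
    """
    n = len(spike_phases)
    if n == 0:
        raise ValueError("need at least one spike phase")
    mean_cos = sum(math.cos(p) for p in spike_phases) / n
    mean_sin = sum(math.sin(p) for p in spike_phases) / n
    return math.hypot(mean_cos, mean_sin)


# Made-up example data: spikes clustered near phase 0 (entrained neuron)
entrained = [0.10, -0.05, 0.02, -0.12, 0.08, 0.00]

# Made-up example data: spikes spread evenly over the cycle (non-entrained neuron)
non_entrained = [2.0 * math.pi * k / 8 for k in range(8)]

print(round(vector_strength(entrained), 3))
print(round(vector_strength(non_entrained), 3))
```

Under this measure, the low-dopamine state described above would correspond to many neurons with values near 1, and the high-dopamine state to most neurons having values near 0.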
These dopamine-dependent changes in corticostriatal synchrony may effect changes in action selection in the basal ganglia. The fact that during high dopaminergic transmission striatal neurons are not entrained to the local field potential implies that whenever an input arrives at the striatum from the cortex, this input will most likely result in an action potential (output), regardless of the phase of the local field potential oscillation at the time of the input. This means that, during high dopaminergic transmission, the relationship between input and output is tighter, and does not depend on the local oscillations. The consequence for action selection would be that most inputs received coincidentally with high dopaminergic transmission would be relayed forward. On the other hand, most inputs arriving from the cortex during low dopaminergic transmission, especially those arriving out of phase with the preferred local field potential oscillation, would not necessarily result in an output and, therefore, the correspondence between input and output would be low (i.e., most inputs would be "gated") (Costa, 2007). This suggests that during low dopaminergic transmission it would be difficult to initiate voluntary actions to obtain particular outcomes, whereas during high dopaminergic states it would be difficult to inhibit inappropriate responses. However, training of goal-directed actions can render their performance less dopamine-dependent.
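The gating argument can be illustrated with a toy simulation. Everything in the sketch below is an assumption introduced for illustration: the `entrainment` parameter (0 for a high-dopamine, non-entrained state; 1 for a low-dopamine, fully entrained state), the cosine-shaped phase dependence, and the function name `relay_fraction` do not come from the studies cited above. The sketch only shows that, when spiking depends on oscillation phase, a smaller fraction of randomly timed cortical inputs is relayed.

```python
import math
import random


def relay_fraction(entrainment, n_inputs=10000, preferred_phase=0.0, seed=0):
    """Fraction of randomly timed inputs that evoke an output spike.

    entrainment = 0.0: spiking ignores oscillation phase (assumed high-dopamine state).
    entrainment = 1.0: spiking probability follows phase fully (assumed low-dopamine state).
    Toy model with an assumed cosine phase dependence; not a fitted model.
    """
    if not 0.0 <= entrainment <= 1.0:
        raise ValueError("entrainment must lie in [0, 1]")
    rng = random.Random(seed)
    relayed = 0
    for _ in range(n_inputs):
        # Inputs arrive at a uniformly random phase of the oscillation.
        phase = rng.uniform(-math.pi, math.pi)
        # Gain is 1 at the preferred phase and 0 at the opposite phase.
        phase_gain = 0.5 * (1.0 + math.cos(phase - preferred_phase))
        p_spike = (1.0 - entrainment) + entrainment * phase_gain
        if rng.random() < p_spike:
            relayed += 1
    return relayed / n_inputs


for level in (0.0, 0.5, 1.0):
    print(level, relay_fraction(level))
```

In this toy model every input is relayed when `entrainment` is 0, and roughly half are relayed when it is 1, because the assumed phase gain averages 0.5 over a full cycle. The exact fractions depend entirely on the assumed gain function and carry no quantitative meaning.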

Dopamine D1 receptor activation is important for the expression of new appetitive learning
Expression of simple appetitive responses is dependent on intact dopamine transmission during the early phases of learning, but becomes dopamine-independent with extended training. Rats under conditions of D1 receptor blockade are strongly impaired in generating approach responses to a food compartment in the absence of a salient response-eliciting cue, but can normally initiate the same behavior when it is cued by a well-trained conditioned stimulus (CS). After three daily sessions of CS-food pairings (28 trials per session), the D1 antagonist SCH23390 disrupts both cued and noncued responding. However, after 16 daily sessions, the D1 antagonist (1) continues to reduce the frequency of noncued head entries during the intertrial interval, but (2) produces no impairment in the latency to perform the same head entry behavior in response to the CS (Choi et al., 2005). Because animals receive only a single drug injection, the change in response vulnerability cannot be attributed to repeated drug administration.
The precise neurobiological changes that accompany habit learning remain to be revealed. However, these results suggest two intriguing possibilities. First, dopamine may play a declining role in modulating response expression within dopamine-innervated regions: for example, task-relevant corticostriatal glutamatergic postsynaptic potentials may require amplification by dopamine (Horvitz, 2002) during early stages of learning. During later stages of learning, these glutamatergic synapses may become so efficient that dopamine facilitation of glutamatergic transmission is no longer necessary for normal responding. Alternatively, the behavior may shift with training to representation by nondopamine target areas and, therefore, become less subject to dopamine modulation. It has been suggested that over the course of habit learning, learned sensory-motor representations may shift from corticostriatal-basal ganglia circuits to direct corticocortical mediation (Miller, 1988; Carelli et al., 1997; Ashby et al., 2007).

Increasing dopaminergic activity accelerates habit formation
The transition from goal-directed to habit-based instrumental performance is mimicked by sensitization with amphetamine. Nelson and Killcross (2006) demonstrated that repeated injections of amphetamine (seven daily injections, 2 mg/kg/d) in rats 1 week before moderate levels of instrumental lever-press training (three sessions of variable interval training for a total of 120 rewards) enhanced the transition of instrumental control from goal-directed to habitual, indexed by a lack of sensitivity of the instrumental response to reward devaluation by specific satiety prefeeding or with nausea-inducing injections of lithium chloride in a devalued extinction test. Control, vehicle-injected animals and animals receiving sensitization with amphetamine after instrumental training showed normal devaluation of instrumental responding, indicative of goal-directed responding. In all cases, magazine approach responses remained sensitive to devaluation. These results have been extended recently to show that the effect of amphetamine sensitization on habit formation persists for a period of up to 6 weeks after the sensitization injections have ceased, emphasizing the role of chronic pretreatment over acute effects of the psychomotor stimulant, amphetamine (A. Nelson and A. S. Killcross, unpublished observation). In untreated rats, habit-based performance, as assessed by a devaluation procedure, is not normally evident until after considerably more sessions of training (typically 9-10 sessions with some 360-400 rewarded lever presses) (Adams, 1982; Killcross and Coutureau, 2003).
This finding extends a number of previous reports demonstrating more rapid acquisition of Pavlovian appetitive approach responses after similar sensitization (Harmer and Phillips, 1998), the ability of sensitization with cocaine to render Pavlovian approach responses insensitive to reward devaluation (Schoenbaum and Setlow, 2005), and the effects of overtraining on the sensitivity of Pavlovian approach to dopamine antagonists discussed above (Choi et al., 2005).
Repeated treatment with psychomotor stimulants produces broad-ranging and enduring behavioral consequences, as well as long-term neural adaptations to regions including the striatum (dorsal and ventral), medial prefrontal cortex (Hitchcott et al., 2007), and amygdala (Robinson and Kolb, 2004). Of particular interest in the context of instrumental habit formation are potential effects within the prelimbic and infralimbic regions of the medial prefrontal cortex and the posterior dorsomedial and dorsolateral striatum, all regions shown to be involved in the transition from goal-directed to habitual control of instrumental responding (Yin et al., 2004, 2005; Faure et al., 2005). For example, recent work has demonstrated changes in the network responsiveness of striatal patch and matrix systems (Canales, 2005), with a reduction in the activation of matrix neurons and a recruitment of striosome-based pathways involving limbic prefrontal cortex and basolateral amygdala. Methamphetamine sensitization has been shown to increase spines on medium spiny neurons in the dorsolateral striatum, while reducing them in the dorsomedial subregion (Jedynak et al., 2007), mimicking the potential transition between neural substrates thought to occur in the shift of control from goal-directed to habitual systems brought about by overtraining. Different mechanisms may be involved in medial prefrontal cortex, where direct injections of dopamine have been reported to enhance the role of goal-directed over habitual responding (Hitchcott et al., 2007).

Can cellular and ensemble actions of dopamine explain a shift from goal-directed performance to habitual responding?
During the early stages of learning, when performance is goal-directed, dopamine may play a dual role. In addition to its actions on synaptic plasticity, dopamine modulates the excitability of striatal output neurons by a complex set of actions on voltage-dependent and receptor-operated ion channels. The effects of strengthened corticostriatal synapses may be amplified by these actions. Accordingly, during early stages of learning, pharmacological disruptions in dopamine transmission are likely to impair behavioral performance by disrupting the throughput of information across task-relevant corticostriatal synapses. With training, and possibly through persisting changes in synaptic efficacy, the relationship between input and output could become tighter.
This may correspond to the apparently decreasing role of dopamine in response expression with extended training. With extended training, operant responses also shift from mediation by outcome value representations to stimulus-response (S-R) mode performance (Fig. 1). It seems likely that dopamine-mediated LTP during learning contributes to the shift to S-R mode performance, because dopamine agonist treatment during learning speeds this shift in processing mode. However, the question of whether the shift to S-R responding and the shift to dopamine-independent response expression coincide remains to be determined.

Figure 1. Strengthening corticostriatal connections can lead to increased correlation of cortical and striatal activity, and a shift to S-R responding. The upward arrow indicates that the glutamatergic corticostriatal synapse (GLU) is strengthened by dopaminergic activity (DA), which may also facilitate information transmission by changing the excitability of striatal cells. Connections are shown from the cortex to the striatum and back to cortex via the globus pallidus and thalamus (unlabeled).

One possibility would be that DA-mediated strengthening of task-relevant corticostriatal synapses would eventually render striatal neurons able to fire in the absence of concomitant dopamine release. This strengthening could, with repetition, extend to different striatal subregions than the ones encoding the initial goal-directed behavior (Miyachi et al., 1997; Yin et al., 2004), which would account for the findings that performance of the well-established behavior is no longer outcome mediated and less DA dependent. Another possibility would be that direct S-R connections (dotted line) are formed, for example, by the establishment of corticocortical connections (Ashby et al., 2007; Carelli et al., 1997). Neurotransmission via these latter connections would also account for the finding that performance of the well-established behavior is no longer outcome mediated.