Abstract
The development of drug-seeking habits is implicated in the transition from recreational drug use to addiction. Using a drug seeking/taking chained schedule of intravenous cocaine self-administration and reward devaluation methods in rats, the present studies examined whether drug seeking that is initially goal-directed becomes habitual after prolonged drug seeking and taking. Devaluation of the outcome of the drug seeking link (i.e., the drug taking link of the chained schedule) by extinction significantly decreased drug seeking, indicating that behavior is goal-directed rather than habitual. With more prolonged drug experience, however, animals transitioned to habitual cocaine seeking: in these animals, cocaine seeking was insensitive to outcome devaluation. Moreover, when the dorsolateral striatum, an area implicated in habit learning, was transiently inactivated, outcome devaluation was effective in decreasing drug seeking, indicating that responding was no longer habitual but had reverted to control by the goal-directed system. These studies provide direct evidence that cocaine seeking becomes habitual with prolonged drug experience and describe a rodent model with which to study the neural mechanisms underlying the transition from goal-directed to habitual drug seeking.
Introduction
Studies of instrumental conditioning in rats have shown that two distinct processes control actions that are instrumental to gaining access to rewarding stimuli. During early learning, behavior is goal-directed, dependent on action-outcome associations and performance is readily influenced by changes in outcome value. As training proceeds, control over performance shifts to a stimulus–response process. Actions become habitual, and insensitive to changes in instrumental contingency or reward value (Adams and Dickinson, 1981; Dickinson, 1985; Balleine and Dickinson, 1998). As a consequence, assessment of changes in operant responding after outcome devaluation is used to differentiate goal-directed from habit-driven behaviors (Adams and Dickinson, 1981; Colwill and Rescorla, 1985; Dickinson, 1985).
Habit learning processes have been implicated in the transition from recreational drug use to the compulsive drug seeking that characterizes addiction (White, 1996; Everitt and Robbins, 2005; Everitt et al., 2008). Lesions of the dorsolateral striatum, an area necessary for habit learning (Packard and Knowlton, 2002; Yin et al., 2004), attenuate cue controlled drug seeking (Ito et al., 2002; Vanderschuren et al., 2005; Fuchs et al., 2006; Belin and Everitt, 2008) and nigrostriatal dopamine lesions disrupt habit formation (Faure et al., 2005). Moreover, prior repeated psychostimulant administration facilitates habitual responding for food reinforcers (Schoenbaum and Setlow, 2005; Nelson and Killcross, 2006; Nordquist et al., 2007) suggesting that cocaine administration accelerates the development of habits.
Studies assessing the development of habitual drug seeking are limited. Using satiation or pairing of oral reinforcers with LiCl-induced sickness, two studies have demonstrated habitual drug seeking of orally administered drug reinforcers (Dickinson et al., 2002; Miles et al., 2003). However, studies of intravenous drug administration have been hampered by difficulties inherent in adapting standard devaluation procedures for natural reinforcers to those of intravenously administered drug. In contrast to natural reinforcers, this route of administration is not associated with an obvious consummatory response and psychostimulants such as cocaine have unconditioned, behaviorally activating, effects that can affect responding.
In an elegant series of studies, Olmstead et al. (2001) circumvented these issues by use of a heterogeneous chained schedule of intravenous cocaine administration in which habitual cocaine seeking was tested by devaluing the final link of a drug seeking/taking chained schedule. In this procedure, responding on the designated drug seeking lever provides access to a drug taking lever, rather than to cocaine itself. Responses on the drug taking lever result in a cocaine infusion. Devaluation of the drug taking link (by extinction) was assessed once performance under the chained schedule had stabilized. Decreased responding during the drug seeking link was observed, indicating that behavior was goal-directed rather than habitual. Fundamental questions remain, however, as to whether extended training on the chained schedule produces habitual drug-seeking responses that are insensitive to devaluation. Such information is important for current theories of addiction and for identifying the contingencies that control drug self-administration.
Accordingly, in the present studies we have used an adaptation of the seeking/taking chained schedule of Olmstead et al. (2001) to address this issue. Given the documented role of the dorsolateral striatum in stimulus–response associations (Yin et al., 2004), the influence of inactivation of this brain area was assessed in animals with a prolonged history of cocaine seeking and taking.
Materials and Methods
Subjects
Male Long–Evans rats (Charles River Laboratories) weighing 300 g at the beginning of experiments were housed 2–3 per cage for at least 1 week before use in facilities accredited by the American Association for the Accreditation of Laboratory Animal Care. They were maintained in a temperature- and humidity-controlled environment under a reverse 12 h light/dark cycle with food and water available ad libitum. One week after surgery, food was restricted to 15 g of standard rodent diet per day and made available after the daily self-administration sessions. Experiments were conducted during the dark cycle. All experiments were approved by the Institutional Care and Use Committee of the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH) (Rockville, MD) and conducted in accordance with the Guide for the Care and Use of Laboratory Animals provided by the NIH and adopted by the NIDA Intramural Research Program.
Surgery
Rats were anesthetized with equithesin (1% pentobarbital, 2% magnesium sulfate, 4% chloral hydrate, 42% propylene glycol, 11% ethanol, 3 ml/kg, i.p.) and a SILASTIC catheter (inner diameter, 0.020 inch; outer diameter, 0.037 inch; Dow Corning) was advanced 3.5 cm into the right jugular vein. The catheter terminated in an L-shaped steel tube mounted on top of the skull with cranioplastic cement and was secured with 3 stainless steel screws. Catheters were flushed daily with 0.1 ml of heparinized saline to maintain patency. Animals were allowed to recover for at least 1 week after the surgery and before food restriction commenced. Catheters were flushed once daily (0.1 ml) with 25 mg/ml gentamicin for 5 d after surgery and then with 120 mg/ml cefazolin throughout the study. Operant training started 2–3 d after the beginning of food deprivation.
Subjects in experiment 3 were stereotaxically implanted with bilateral guide cannulae (C315, Plastics One) aimed 1 mm dorsal to the dorsolateral striatum (anteroposterior, +0.5; lateral, ±3.6; ventral, −3.6 mm relative to bregma).
Behavioral procedures
Experiment 1.
Operant training took place in rat operant chambers equipped with 2 retractable levers (Med Associates). The training protocol was adapted from Olmstead et al. (2001). Animals were trained to lever press one lever (designated as the drug taking lever) for intravenous cocaine infusions under a fixed ratio 1 (FR1) schedule. Each lever press resulted in infusion of 0.75 mg/kg cocaine accompanied by retraction of the taking lever, extinction of the house light and illumination of a cue light above the taking lever for 30 s. Animals were allowed to self-administer for a maximum of 40 infusions or 2 h. After reliable self-administration was established (2 consecutive sessions >10 infusions/session), simultaneous presentation of sucrose (10 μl of 20% sucrose via a liquid dipper) was initiated. The dipper cup containing sucrose, which was located in the food magazine, remained in the up position. The first head entry into the dipper magazine after a random interval 60 s (RI60) schedule had elapsed resulted in refilling of the sucrose cup. Head entries into the dipper magazine were recorded as a measure of a Pavlovian approach response (Corbit and Balleine, 2005) and used to control for nonspecific effects on general activity. Cocaine self-administration under the FR1 schedule continued with simultaneous access to sucrose under the RI60 schedule until performance stabilized (3 consecutive sessions >20 infusions and <20% between-session variation). A chained schedule was then introduced for cocaine self-administration. Every infusion cycle started with the insertion of a second lever, designated as the drug seeking lever. The first lever press on the seeking lever after completion of a random interval 2 s (RI2) schedule resulted in retraction of the seeking lever and extension of the taking lever.
The first response on the taking lever resulted in the cocaine infusion sequence (i.e., infusion of 0.75 mg/kg cocaine followed by retraction of the taking lever, extinction of the house light and illumination of the cue light above the taking lever). A time-out of 30 s (TO30) was imposed after every infusion, after which another infusion cycle started with the presentation of the drug seeking lever. Thus, animals self-administered cocaine under an RI2/FR1:TO30 chained cocaine seeking/taking schedule. Each daily session lasted 3 h or 12 infusions, whichever came first. The RI and TO components of the chained schedule were then increased: RI2/FR1:TO30–RI20/FR1:TO120–RI60/FR1:TO300–RI120/FR1:TO600. Animals received at least 2 daily sessions on each schedule and were allowed at least 6 sessions on the final RI120/FR1:TO600 schedule. Sucrose was simultaneously available on the RI60 schedule throughout the experiment. After stable responding on the final chained schedule was achieved, drug taking was devalued by daily 2 h extinction sessions. In these sessions only the drug taking lever was present and lever responding had no consequences. Sucrose was still available. After 13 drug taking extinction sessions, drug seeking was assessed in a 5 min test. During this test only the drug seeking lever was present and responses had no consequences. Drug taking was then revalued in two sessions in which animals were allowed to earn cocaine infusions under an FR1 schedule (40 infusions or 2 h, whichever came first) with only the drug taking lever present. A second 5 min drug seeking test was then administered. Thus, cocaine seeking was assessed in two identical tests, after devaluation and again after revaluation of the outcome of the cocaine seeking lever (the drug taking lever).
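As a rough illustration of the chained schedule described above, the following Python sketch simulates a single session. Only the schedule parameters (RI means, time-outs, the 12-infusion cap and the 3 h limit) come from the text. The function name, the exponential draw used for the random interval, the constant press rate and the taking latency are assumptions made for illustration and do not describe the control programs actually used.

```python
import random

# Schedule progression from the text: (RI mean in s, time-out in s).
SCHEDULES = [(2, 30), (20, 120), (60, 300), (120, 600)]


def simulate_chained_session(ri_mean_s=120.0, timeout_s=600.0,
                             max_infusions=12, session_limit_s=3 * 3600,
                             press_rate_hz=0.2, taking_latency_s=2.0, seed=0):
    """Simulate one RI/FR1:TO session; return (infusions, seeking_presses).

    press_rate_hz and taking_latency_s are hypothetical values chosen only
    to make the sketch run; they are not reported in the text.
    """
    rng = random.Random(seed)
    t = 0.0
    infusions = 0
    seeking_presses = 0
    while infusions < max_infusions:
        # Seeking link: the first press after the random interval elapses
        # retracts the seeking lever and extends the taking lever.
        armed_at = t + rng.expovariate(1.0 / ri_mean_s)
        while True:
            t += rng.expovariate(press_rate_hz)  # time of the next seeking press
            if t >= session_limit_s:
                return infusions, seeking_presses  # session timed out
            seeking_presses += 1
            if t >= armed_at:
                break
        # Taking link (FR1): a single press delivers the infusion.
        t += taking_latency_s
        infusions += 1
        # Time-out before the seeking lever is presented again.
        t += timeout_s
    return infusions, seeking_presses
```

Calling `simulate_chained_session(*SCHEDULES[-1])` runs the final RI120/FR1:TO600 schedule. Under the assumed press rate, the number of seeking presses per infusion depends mainly on the RI value, which is the property of interval schedules that the Discussion refers to.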
Experiment 2.
The protocol was identical to that in experiment 1 except that after completing the cocaine seeking test, an additional 36 sessions (RI120/FR1:TO600; 3 sessions/d × 12 d) in which rats received 12 infusions/session were conducted. Each of the three daily sessions was separated by 120 min and occurred during the light phase of the reverse light/dark cycle. Thereafter, extinction of the drug taking lever followed by cocaine seeking tests under devalued and revalued conditions was performed exactly as described for initial training. In this way, the effects of devaluation of the drug taking lever on drug seeking were evaluated in an identical manner both after initial limited training and after extensive drug experience.
Experiment 3.
The protocol was identical to that of experiment 2, with the exception that no extinction or drug seeking tests were given after short training. Animals transitioned to the extended training protocol immediately after stable baseline responding was achieved for 6 sessions. Extended training, which occurred during the light phase of the light/dark cycle, consisted of 30 sessions (3 sessions/d, 2 h apart) and was followed by extinction of the taking lever and the corresponding cocaine seeking tests. The dorsolateral striatum was transiently inactivated in half of the animals by microinjecting 1 μl of 4% lidocaine bilaterally 5 min before both cocaine seeking tests (i.e., devalued and revalued conditions). Control animals received an equivalent volume of aCSF. For microinjections, a microinjection cannula (C315I, Plastics One) that extended 1 mm below the tip of the guide was inserted into the guide cannula. One microliter of solution was infused over 1 min. The microinjection cannula was removed 2 min later to allow for diffusion of the solution into the tissue.
Histology
At the end of experiment 3, cresyl violet (1% × 1 μl) was bilaterally microinjected into the guide cannulae. The animals were then killed and brains removed and frozen on dry ice. Histological verification of the microinjection site was performed in 30 μm frozen sections.
Data analysis
Statistical analysis was conducted using a paired t test to compare cocaine-seeking responses under devalued and revalued conditions (experiment 1). A two-way repeated-measures ANOVA with value of the cocaine taking response as the within-subject factor and duration of training (experiment 2) or lidocaine treatment (experiment 3) as the other main factor was used to assess the effects of training history or striatal inactivation on the devaluation of cocaine-seeking responses.
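For readers unfamiliar with the paired comparison used in experiment 1, the sketch below computes the paired t statistic and its degrees of freedom with the Python standard library only. The function name and the response counts are hypothetical and serve only to show the calculation; they are not data from this study. A p value would additionally require the t distribution, for example via `scipy.stats.ttest_rel`.

```python
import math
from statistics import mean, stdev


def paired_t(devalued, revalued):
    """Return (t, df) for a paired t test on per-animal response counts."""
    if len(devalued) != len(revalued) or len(devalued) < 2:
        raise ValueError("need two equal-length samples with at least 2 animals")
    # Within-animal differences (revalued minus devalued).
    diffs = [r - d for d, r in zip(devalued, revalued)]
    n = len(diffs)
    # t = mean difference divided by its standard error.
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Hypothetical seeking responses for four animals (illustration only).
hypothetical_devalued = [10, 12, 8, 15]
hypothetical_revalued = [14, 15, 13, 18]
t_value, df = paired_t(hypothetical_devalued, hypothetical_revalued)
```

With 14 animals, as in experiment 1, the same function would return 13 degrees of freedom, matching the value reported in the Results.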
Results
Experiment 1
Animals learned the RI120/FR1:TO600 chained schedule in 20–25 sessions and earned 12 cocaine infusions per session in <3 h. Withholding cocaine infusions extinguished responses on the drug taking lever (Fig. 1A). In the absence of cocaine infusions, however, the number of sucrose reinforcers earned increased threefold and remained elevated throughout extinction. Both drug taking responses and the number of sucrose reinforcers earned returned to pre-extinction levels when cocaine access, under an FR1 schedule, was resumed. To determine whether devaluation of the cocaine taking response by extinction affected the cocaine seeking link, responding during the two drug seeking tests (devalued and revalued conditions) was compared (Fig. 1B). A paired t test indicated significantly lower cocaine seeking responses under devalued conditions (t = 2.60, df = 13, p = 0.022). Devaluation of the cocaine taking link of the chain did not significantly alter head entries in the sucrose magazine, suggesting that the decreased cocaine seeking under devalued conditions was not due to a generalized decrease in activity after prolonged extinction. Cocaine seeking after devaluation of the drug taking link was 70 ± 11% of cocaine seeking under revalued conditions (Fig. 1C). However, examination of each animal's data from the cocaine seeking tests revealed devaluation effects ranging from 14% to 138%. Of the 14 subjects tested, drug seeking in 8 animals was clearly attenuated by outcome devaluation, as evidenced by a 33–86% decrease in responding. However, responding of 6 animals after devaluation was >80% of that under the revalued condition, indicating poor sensitivity to devaluation in these animals. Thus, in this subset of animals (43% of the total), cocaine seeking was not goal-directed but habitual in nature. In experiment 2 we assessed whether the number of animals exhibiting habitual cocaine seeking increased with more extended cocaine seeking experience.
Devaluation testing: experiment 1. A, After stable acquisition of the drug seeking/taking chained schedule, the cocaine taking link was devalued by extinction. Responses on the drug seeking lever and head entries into the sucrose magazine were measured under extinction conditions during 5 min tests before and after revaluation of the drug taking link (arrows) by two sessions where cocaine infusions were available again under an FR1 schedule. B, Cocaine seeking responses and head entries into the sucrose magazine during tests conducted under devalued and revalued conditions. C, The magnitude of the devaluation effect was calculated as cocaine seeking responses after devaluation expressed as a percentage of seeking responses under the revalued condition for each animal (filled circles, n = 14). Percentage of seeking responses under the revalued condition is plotted for each animal. The average ± SEM for the group (solid horizontal line) is also shown. All data are expressed as the mean ± SEM. *p < 0.05, paired t test.
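The per-animal devaluation measure described for panel C can be sketched as follows. The 80% cutoff mirrors the value the Results use to describe poor sensitivity to devaluation; the function names, the labels and the example response counts are hypothetical and are not taken from the study's data or analysis scripts.

```python
import math
from statistics import mean, stdev


def devaluation_percent(devalued_responses, revalued_responses):
    """Seeking after devaluation as a percentage of seeking after revaluation."""
    if revalued_responses <= 0:
        raise ValueError("revalued responses must be positive")
    return 100.0 * devalued_responses / revalued_responses


def classify(percent, threshold=80.0):
    """Label an animal using the >80% criterion mentioned in the text."""
    return "insensitive" if percent > threshold else "sensitive"


def mean_sem(values):
    """Group mean and standard error of the mean."""
    return mean(values), stdev(values) / math.sqrt(len(values))


# Hypothetical (devalued, revalued) response counts for three animals.
hypothetical_counts = [(7, 50), (45, 50), (30, 60)]
percents = [devaluation_percent(d, r) for d, r in hypothetical_counts]
labels = [classify(p) for p in percents]
group_mean, group_sem = mean_sem(percents)
```

A value below 100% means fewer seeking responses after devaluation than after revaluation, so lower percentages correspond to stronger devaluation effects.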
Experiment 2
To investigate whether extended training would result in development of insensitivity of cocaine seeking to outcome devaluation (i.e., extinction of the cocaine taking link), rats were trained as in experiment 1 but received 36 additional drug seeking/taking sessions. Cocaine and sucrose seeking remained relatively stable throughout the extended training procedure (Fig. 2A). However, in contrast to experiment 1, the sensitivity of cocaine seeking to devaluation of the drug taking link was significantly reduced after extended training (Fig. 2B). Statistical analysis revealed a significant training experience × devaluation interaction (F(1,5) = 7.659, p = 0.04). Head entries into the sucrose magazine were not significantly affected by devaluation of cocaine taking regardless of training experience (Fig. 2C, F(1,5) = 1.0, p = 0.36), arguing against nonspecific changes in general activity. The percentage change in cocaine seeking for individual animals after devaluation is shown in Figure 2D. Examination of the short training condition data suggests that the significant devaluation effect for the group average again resulted from a subgroup of animals in which cocaine seeking decreased in excess of 50% after devaluation (Fig. 2D, filled circles). The remaining animals did not show any apparent devaluation effect, indicating that responding of these animals was no longer goal-directed (Fig. 2D, open circles). Analysis of both subgroups after additional training revealed a loss of the devaluation effect in the devaluation-sensitive subgroup (Fig. 2E, filled circles). In contrast, little change in the performance of the devaluation-insensitive subgroup was seen (Fig. 2E, open circles). These data are consistent with a shift from goal-directed (sensitive to changes in outcome value) to habitual cocaine seeking behavior (insensitive to changes in outcome value) after extended drug seeking experience.
Effects of training duration on sensitivity to devaluation: experiment 2. A, Devaluation tests were performed after short training and again after extended training of the drug seeking/taking chained schedule (arrows). Cocaine seeking responses and head entries into the sucrose magazine for each session, consisting of 12 drug infusions, are shown. Dotted lines indicate each block of 3 daily sessions. The devaluation tests followed the same procedure as in experiment 1. B, Seeking responses for cocaine during the test. C, Head entries into the sucrose magazine during the test. D, The magnitude of the devaluation effect was calculated as cocaine seeking responses after devaluation expressed as a percentage of seeking responses under the revalued condition for each animal. Animals were assigned to two groups according to their sensitivity to devaluation after short training [3 animals showed no effect of devaluation (open circles) and 3 animals devalued 40% or more (filled circles)]. E, The average ± SEM devaluation effect for each of these groups is shown. All data are expressed as the mean ± SEM.
Experiment 3
Since the dorsolateral striatum is required for habitual behavior (Yin et al., 2004; Yin and Knowlton, 2006), we tested whether transient inactivation of this brain region would block habitual cocaine seeking and reinstate sensitivity to outcome value. After completing the extended training, animals were assigned to two groups with similar cocaine seeking behavior (Fig. 3A). The drug taking response was then extinguished as in experiments 1 and 2. However, the dorsolateral striatum was inactivated before each seeking test by local bilateral infusion of 4% lidocaine. Control animals received aCSF infusions. Three rats were eliminated from the study due to failure to respond under revalued conditions (1, 3, and 3 responses, respectively). Two-factor repeated-measures ANOVA revealed a significant main effect of devaluation (F(1,18) = 26.17, p < 0.0001) and a significant infusion × devaluation interaction (F(1,18) = 5.21, p = 0.035), indicating that the effect of devaluation varied as a function of intrastriatal infusion (Fig. 3B). Post hoc analysis revealed a significant reduction in drug-seeking responses only in rats that had received infusion of lidocaine before testing (p < 0.01, Student–Newman–Keuls test). No significant devaluation effect was detected in response to aCSF infusion (p > 0.05, Student–Newman–Keuls test), consistent with habitual responding in this group. Head entries in the sucrose magazine were not affected by devaluation of the cocaine taking link (Fig. 3C, F(1,18) = 2.11, p = 0.16). There was a trend toward increased generalized responding for sucrose in the lidocaine-treated animals regardless of the value of the cocaine taking link, but this trend was not significant (F(1,18) = 3.18, p = 0.09).
Influence of transient inactivation of the dorsolateral striatum on devaluation: experiment 3. Devaluation testing was performed after an extended training procedure identical to that used in experiment 2. Animals were divided into 2 groups with similar performance during the drug seeking/taking schedule. A, Cocaine-seeking responses for each session, consisting of 12 drug infusions, are shown. Dotted lines separate each block of 3 daily sessions. The devaluation tests took place only at the end of the extended training (arrow) and followed the same procedure as in experiment 1, except that animals received infusions of aCSF or 4% lidocaine in the dorsolateral striatum immediately before the drug seeking tests. B, Cocaine seeking responses during testing. C, Head entry responses during testing. D, Histological reconstruction of cannula placements (open circles, aCSF; filled circles, 4% lidocaine). All data are mean ± SEM. Number of animals per group is shown in parentheses. *p < 0.01, Student–Newman–Keuls post hoc test.
Discussion
It has been proposed that habitual drug seeking develops after extended drug experience and that this contributes to the development of the compulsive drug seeking that characterizes addiction (Everitt et al., 2008). Using a chained schedule of cocaine seeking and taking, the present study provides direct evidence that after extended—but not limited—intravenous cocaine self-administration, drug seeking becomes habitual and under the control of stimulus–response associations.
In this study, goal-directed versus habitual drug seeking was tested by devaluing the drug taking link of the chained schedule rather than the unconditioned effects of the drug. This procedure differs from standard reward devaluation techniques for oral reinforcers in which the unconditioned reinforcers are devalued by poisoning or satiation. It has been argued that activities directed to the procurement of drug (drug seeking) differ from the more immediate, consummatory responses of drug taking (O'Brien et al., 1998). The distinction between seeking and taking responses is important on theoretical and experimental grounds (Olmstead et al., 2001). Indeed, experimental evidence indicates the involvement of distinct processes in seeking and taking responses (Balleine et al., 1995; Arroyo et al., 1998; Ito et al., 2004). Separation of drug seeking from drug taking in the chained schedule used in the present studies allows investigation of habitual drug seeking maintained by cocaine associated stimuli, which is thought to be critical in human addicts (O'Brien et al., 1998), in the absence of the unconditioned effects of drug on operant responding (Everitt and Robbins, 2000).
As in Olmstead et al. (2001), after a moderate amount of training, drug seeking is an action under control of the response-outcome association. Thus, extinction of the drug-taking link of the chain resulted in a significant decrease in responding on the cocaine-seeking lever. Devaluation tests were conducted once stable performance on the RI120/FR1:TO600 chained schedule was achieved (i.e., after 20–25 sessions). Importantly, the use of the variable interval schedule in the drug seeking link of the chain generated high levels of responding (15–30 seeking responses/infusion earned). As a consequence, animals emitted 200–300 responses per session during the short training condition. Although previous work has shown that variable interval schedules are more effective in promoting the development of habits than commonly used fixed ratio schedules (Dickinson et al., 1983), this amount of operant experience was still insufficient to elicit habitual responding. Such findings are noteworthy in that they suggest that self-administration studies in which continuous reinforcement schedules and limited access conditions are used may not provide sufficient drug seeking experience to engage the habit system. Consistent with this contention, only when the number of training sessions was increased (36 additional sessions) was a shift to habitual drug seeking seen (i.e., responding became insensitive to outcome devaluation).
Accumulating evidence indicates that addiction is a disorder in which casual drug use progresses to habitual and ultimately compulsive drug seeking and taking (Everitt and Robbins, 2005). By enabling differentiation of responding that is goal-directed from that which is habitual, the present studies suggest that use of a chained schedule of intravenous cocaine self-administration provides an effective animal model with which to study the mechanisms involved in the development of drug seeking habits (Koob et al., 2004; Vanderschuren and Everitt, 2004; Zernig et al., 2007).
Devaluation of the cocaine taking link of the chain did not decrease approach responses to the sucrose compartment. It should also be noted that the head entry response is not a “pure” instrumental response and also consists of a Pavlovian approach component. Head entries into a food magazine have been used to control for nonspecific response suppression by punishment in studies using drug seeking/taking schedules of cocaine self-administration (Pelloux et al., 2007). In the present study, head entries increased during extinction of the drug taking link. This finding is consistent with the observation that suppression of cocaine responding increases responding for food (Negus, 2005). Furthermore, the lack of effect of the devaluation procedure on sucrose approach responses during the seeking test indicates that a decrease in drug seeking responding after devaluation or after intrastriatal lidocaine infusions is not the result of a generalized decrease in motor performance due to habituation, sedation or motor impairment.
Examination of each animal's performance during devaluation tests after the moderate training condition revealed that, although drug seeking was goal-directed for the experimental group as a whole, a subset of animals already displayed habitual responding. At present, it is unclear whether individual variability reflects preexisting differences in the learning strategy used (i.e., some animals rely on stimulus–response associations early in the acquisition of a new instrumental task) or faster transition from goal-directed to habitual behavior in this subset of animals. However, the observation that all animals that display goal-directed behavior after moderate training develop habitual drug seeking with extended training (experiment 2) strongly suggests that drug seeking that is initially goal-directed becomes habitual after extensive drug experience.
It has been suggested that compulsive drug seeking can be characterized as a maladaptive stimulus–response habit in which the ultimate goal of the behavior has been devalued, perhaps through tolerance to the rewarding effects of the drug and that the persisting quality of these habits is central to drug addiction (Everitt and Robbins, 2005; Everitt et al., 2008). Additional studies are needed to determine whether individual differences in the rate of development of habitual drug seeking correlate with the propensity to develop compulsive drug seeking, as has been suggested (Everitt et al., 2008), or how it relates to other traits that are thought to predict the development of an addiction phenotype (Belin et al., 2008, 2009b).
Stimulus–response learning has been shown to be dependent on dorsolateral striatal function (Packard et al., 1989; Kantak et al., 2001). To determine whether the dorsolateral striatum is critical for the expression of habitual cocaine-seeking behavior, lidocaine was used to transiently inactivate this region. In contrast to control animals, which exhibited insensitivity to outcome devaluation after extended training, devaluation significantly reduced cocaine seeking in lidocaine-treated rats. Lidocaine has been used extensively to transiently inactivate the dorsal striatum and other brain regions (Van Golf Racht-Delatour and Massioui, 2000; Chang and Gold, 2004; Espina-Marchant et al., 2009). Although lidocaine may have affected fibers of passage or areas adjacent to the injection site, the present results strongly suggest that disruption of the habit system by transient inactivation of the dorsolateral striatum caused established habitual drug seeking behavior to revert to control by the goal-directed system. This observation is consistent with imaging studies showing a progressive engagement of the dorsal striatum only after extended drug history in monkeys (Porrino et al., 2004) and humans (Volkow et al., 2006) and with pharmacological studies showing an involvement of the dorsal striatum after prolonged training using second order schedules in rats (Vanderschuren et al., 2005; Fuchs et al., 2006; Belin and Everitt, 2008). These results extend those of Yin et al. (2004), who demonstrated that dorsolateral striatal lesions prevent the development of habitual responding for food reinforcers, and those of Hitchcott and colleagues (2007), who showed that in rats trained to express habitual responding for food, infusions of dopamine into the ventromedial prefrontal cortex restored goal-directed responding.
Furthermore, they are consistent with the hypothesis that both the habit and the goal-directed systems function in parallel to control instrumental behavior and that disruption of the habit system reengages goal-directed processes that control behavior.
It has been argued that instrumental training engages two independent learning processes controlled respectively by action-outcome and stimulus–response associations, which operate concurrently (Dickinson, 1985). One important issue is whether the insensitivity to outcome devaluation seen after extended cocaine experience reflects strengthening of the control of behavior by the stimulus–response associations or, on the contrary, a disruption of the ability of response-outcome associations to guide behavior. In this regard, Schoenbaum and Setlow have suggested that chronic cocaine experience impairs the ability of response-outcome associations to guide behavior (Schoenbaum and Setlow, 2005). However, transient lidocaine inactivation of the dorsolateral striatum restores sensitivity to outcome value in cocaine experienced animals suggesting that the ability of the response-outcome association to control drug seeking remains intact after extended cocaine experience but is masked or inhibited by the habit system. In this regard, however, it should be noted that Schoenbaum and Setlow (2005) examined Pavlovian rather than instrumental responses which are known to be controlled by different brain systems (Belin et al., 2009a). Moreover, cocaine experience was limited and was experimenter-administered. These differences preclude a direct comparison between studies. However, the present finding that inhibition of the habit system can shift control of drug seeking to a goal-directed, voluntary process, has important implications for medications development given the proposed contribution of habitual behavior to the automaticity of drug seeking observed in addicts as well as to drug-craving and relapse to addiction (Tiffany, 1990).
In summary, after prolonged drug experience, intravenous cocaine seeking becomes a habit insensitive to changes in outcome value. Inactivation of the dorsolateral striatum causes established habitual cocaine seeking to revert back to control of the goal-directed system, suggesting that drug seeking habits may be reversible. The present studies suggest that engagement of the habit system requires more prolonged drug seeking experience than that afforded by most animal drug self-administration studies and are consistent with the notion that different brain systems are engaged as drug self-administration progresses from controlled drug use to abuse and to addiction. Finally, our studies suggest that the devaluation of the drug-taking link of a chained schedule of cocaine self-administration provides an effective model with which to study the neural substrates underlying goal-directed versus habitual drug seeking.
Footnotes
- This work was supported by funding from the Intramural Research Program, National Institute on Drug Abuse, National Institutes of Health, Department of Health and Human Services.
- Correspondence should be addressed to Dr. Agustin Zapata, Integrative Neuroscience Section, National Institute on Drug Abuse Intramural Research Program, 333 Cassell Drive, Baltimore, MD 21224. Azapata{at}mail.nih.gov