Abstract
Drug self-administration models of addiction typically require animals to make the same response (e.g., a lever-press or nose-poke) over and over to procure and take drugs. By their design, such procedures often produce behavior controlled by stimulus–response (S-R) habits. This has supported the notion of addiction as a “drug habit,” and has led to considerable advances in our understanding of the neurobiological basis of such behavior. However, to procure such drugs as cocaine, addicts often require considerable ingenuity and flexibility in seeking behavior, which, by definition, precludes the development of habits. To better model drug-seeking behavior in addicts, we first developed a novel cocaine self-administration procedure [puzzle self-administration procedure (PSAP)] that required rats to solve a new puzzle every day to gain access to cocaine, which they then self-administered on an intermittent access (IntA) schedule. Such daily problem-solving precluded the development of S-R seeking habits. We then asked whether prolonged PSAP/IntA experience would nevertheless produce “symptoms of addiction.” It did, including escalation of intake, sensitized motivation for drug, continued drug use in the face of adverse consequences, and very robust cue-induced reinstatement of drug seeking, especially in a subset of “addiction-prone” rats. Furthermore, drug-seeking behavior continued to require dopamine neurotransmission in the core of the nucleus accumbens (but not the dorsolateral striatum). We conclude that the development of S-R seeking habits is not necessary for the development of cocaine addiction-like behavior in rats.
SIGNIFICANCE STATEMENT Substance-use disorders are often characterized as “habitual” behaviors aimed at obtaining and administering drugs. Although the actions involved in consuming drugs may involve a rigid repertoire of habitual behaviors, evidence suggests that addicts must be very creative and flexible when trying to procure drugs, and thus drug seeking cannot be governed by habit alone. We modeled flexible drug-seeking behavior in rats by requiring animals to solve daily puzzles to gain access to cocaine. We find that habitual drug-seeking is not necessary for the development of addiction-like behavior, and that our procedure does not result in transfer of dopaminergic control from the ventral to dorsal striatum. This approach may prove useful in studying changes in neuropsychological function that promote the transition to addiction.
Introduction
In defining “addiction,” the Oxford English Dictionary (Oxford UP, 2017) cites an article from the Journal of the American Medical Association (1906), stating “it matters little whether one speaks of the opium habit, the opium disease or the opium addiction.” But is this correct? Is addiction equivalent to a “habit” (Tiffany, 1990; Everitt and Robbins, 2005, 2016; Lewis, 2015; Smith and Laiks, 2017)? In psychology, a habit refers to specific patterns of behavior controlled by stimulus–response (S-R) associations. Defining characteristics include automaticity, continued responding despite devaluation of the reward, as well as “[increased] speed and efficiency, limited thought, rigidity, and integration of sequences of responses that can be executed as a unit” (Wood and Rünger, 2016, p. 292; see also Graybiel, 2008). Certainly, behaviors involved in consuming drugs, once obtained, can be automated and habitual (Tiffany, 1990). But what about behaviors involved in procuring (seeking) drugs? In fact, to procure drugs, addicts typically show considerable ingenuity and flexibility in their behavior, first, to acquire the money to purchase drugs, then to locate a possible drug source, and finally to negotiate a purchase, often under very challenging circumstances (Preble et al., 1969; Neale, 2002; Heather, 2017). Such motivated, goal-directed behavior requires daily solutions to unique problems and, by definition, is not habitual.
However, animal self-administration studies of addiction often use procedures that necessarily promote both drug-seeking and drug-taking S-R habits (Vandaele and Janak, 2017). When animals are trained to make an action (e.g., a lever press) to receive an intravenous injection of a drug (and an associated cue), they quickly acquire self-administration behavior (Weeks, 1962). It is generally agreed that such behavior is initially controlled by learned associations between the act (lever press) and the outcome [intravenous drug; i.e., cognitive act–outcome (A-O) associations], as well as motivated by Pavlovian relationships between drug cues and drug effects that trigger incentive motivation (S-O associations; Everitt and Robbins, 2005). At this stage, cocaine-seeking behavior is thought to be strongly controlled by dopamine (DA) activity in the ventral striatum (Robledo et al., 1992; Ito et al., 2004). However, with more prolonged drug experience, there can be a gradual transfer of control over behavior from A-O (and S-O) associations to S-R habits, as behavior becomes more automatic and stereotyped, and this is accompanied by increasing involvement of the dorsal (vs ventral) striatum in the control of drug-seeking behavior (Ito et al., 2002; Di Ciano and Everitt, 2004; Vanderschuren et al., 2005; Belin and Everitt, 2008; Zapata et al., 2010). Thus, behaviors that are initially goal-directed and “shaped and maintained by [their] consequences” (Skinner, 1971), “increasingly become elicited as stimulus–response habits” (Everitt, 2014, p. 2163; see also Dickinson, 1985). In animal studies, this occurs in part because the same response must be repeated over and over to procure the drug. In addition, the response is sometimes temporally separated from receipt of the reinforcer, as with interval schedules, which also promotes S-R habits (Dickinson, 1985; Dickinson et al., 1995; Everitt and Robbins, 2000; Wood and Rünger, 2016).
However, unlike the act of drug taking, the creativity and resourcefulness addicts must show to procure drugs suggest that this behavior is not dominated by habit (Preble et al., 1969; Neale, 2002; Heather, 2017).
Therefore, our aim was to first develop a cocaine self-administration procedure in rats that better reflects the flexible problem-solving required of addicts to procure drugs. To do this, rats, like addicts, were required to solve a new problem every day to gain access to a drug; simply repeating stereotyped actions that worked in the past would not suffice. This precluded the development of habitual drug-seeking behavior. Our second aim was to then use this procedure to ask whether S-R habits, and the associated transfer of behavioral control from the ventral to dorsal striatum, are indeed necessary for development of addiction-like behavior in rats, as assessed using behavioral economic indicators of cocaine demand (Zimmer et al., 2012; Bentzley et al., 2013; Kawa et al., 2016).
Materials and Methods
Subjects
Male Long–Evans rats (n = 46; Charles River Laboratories), weighing 250–275 g on arrival, were individually housed in a temperature-controlled and humidity-controlled vivarium on a reverse light cycle. After acclimating to housing conditions for 1 week with food and water available ad libitum, rats were held at a steady body weight (∼90%; food restricted to ∼25 g/d) for an additional week before experimental procedures commenced. Behavioral testing occurred during the dark phase of the light cycle. All procedures were conducted according to a protocol approved by the University of Michigan Committee on Use and Care of Animals.
Apparatus
Behavioral training took place in standard Med Associates operant chambers (22 × 18 × 13 cm) enclosed within ventilated sound-attenuating compartments. All manipulanda or conditioned stimulus (CS) devices were purchased from Med Associates. For all tests, a cue light was located on the center–top of the front side of the chamber, with a single retractable lever with a flat edge positioned below and either on the left or right side of the light. This lever will be referred to as the “taking lever.” Chambers were always equipped with a red house light on the back wall of the chamber, directly opposite the cue light. A speaker used for presentation of a tone (see below) was positioned directly below the house light. The puzzle “seeking” manipulanda consisted of the following: (1) a response wheel that made an audible click every quarter rotation; (2) a fixed lever with a rolled edge; and (3) a nose port. These were positioned on the bottom–rear of the chamber (either to the left, to the right, or directly underneath the speaker). During initial training, a food cup was positioned on the front side of the chamber, below the cue light. Banana-flavored pellets were delivered to this food cup via a dispenser mounted outside the chamber. Both the food cup and dispenser were removed during drug self-administration. For drug self-administration, responses on the retractable lever activated a syringe pump (mounted outside the sound-attenuating box), which delivered intravenous cocaine to the tethered rat via tubing connected to the rat's catheter back port.
Experimental procedures
Food training.
The puzzles rats had to solve to gain access to a reward (food or drug; Figs. 1, 2; Table 1) were very demanding and thus considerable training was required for them to acquire the task. For this reason, rats, before catheter implantation, were initially trained to solve puzzles to gain access to a food reward. This was to better ensure that their catheters remained patent during the later prolonged cocaine self-administration phase of the experiment. Thus, rats were first familiarized with banana-flavored food pellets in their home cages for 2 d before experimental procedures began. Then, on a single pretraining day, rats were taught to retrieve the pellets from a food cup in the operant chambers according to a variable-time 30 s schedule (Fig. 1, Stage 1). During the next 2 d, rats lever-pressed on the taking lever, which remained extended, to receive a total of 60 pellets/session on a fixed ratio 1 (FR1) schedule. Finally, rats began training on the “seeking” manipulanda (response wheel, rolled-edge lever, nose port), which were separately introduced during 3 d blocks. Each session began with the house light off and then turned on after 60 s. The house light on signaled that the “seeking” manipulanda were active (later referred to as “Puzzle-ON”). On the first day of each block, a single response on the respective seeking manipulandum resulted in a tone presentation (1 s) and subsequent extension of the taking lever. Rats were then allowed to lever-press for pellets (with 1 s CS-light presentation) on an FR1 schedule for 1 min. Then, the house light was turned off (“Puzzle-OFF”) and the taking lever retracted, signaling a 20 s time-out period. The house light then turned back on, signaling the second trial (of eight trials total) and enabling the rats to activate the seeking manipulandum. Similar procedures were used on the second and third days of each training block, but the number of required responses on the seeking manipulandum was increased to three. 
After completing the training block, the seeking manipulandum was removed and replaced with another one. These food-training procedures were repeated until all rats learned the pattern of reward seeking and taking (completion of eight trials during 2 consecutive days).
Figure 1. Schedule of experimental procedures. The experimental procedures are divided into four stages: (1) food training (data not shown), (2) cocaine self-administration training, (3) the PSAP/IntA procedure and drug-seeking tests, and (4) final tests of addiction-like behavior. See Table 1 for a description of PSAP Puzzles #1–28.
Figure 2. Representation of the PSAP/IntA cocaine-taking procedure. The behavior required to solve Puzzle #15 is illustrated. The drug-seeking phase requires the completion of two distinct response sequences. In this example, the first response series requires the rat to make four presses on the rolled-edge lever. If successful (correct responses denoted by solid/thick lines), this is followed by a 1 s tone, and then the rat must complete the second response series, consisting here of two wheel turns. If this is also successful, the tone sounds again and this is followed by insertion of the taking lever and the transition to the drug-taking phase. However, if either the first or second response sequence during the drug-seeking phase is performed incorrectly (indicated by dashed lines), no tone is presented and the animal has to reinitiate the first response series (i.e., restart the puzzle from the beginning). For example, for this puzzle, if a rat initially responds on either the nose-poke hole or wheel, it does not hear any tone until it makes the required four responses on the rolled-edge lever. Furthermore, if, after four responses on the rolled-edge lever have produced a tone, the rat next responds on the nose poke or makes another response on the rolled-edge lever, the puzzle resets. After successful completion of the second response series, the taking lever extends into the chamber and the rat is allowed to self-administer cocaine on an FR1 schedule, with no time-out, for 5 min. Each cocaine infusion is presented along with a CS light. After 5 min, the drug-taking lever retracts, the house light is turned off, and a 25 min time-out period begins. After the 25 min time-out period, the house light is turned back on and another trial of PSAP/IntA is initiated (10 trials or 7 h/d).
Table 1. PSAP schedule of puzzles
In a subset of rats (n = 12; not used for cocaine self-administration), food training continued using puzzles similar to those described in Table 1 (8 trials/d as described above, ∼20 d total). Then, in counterbalanced order and separated by 3 additional days of puzzles, reward seeking under extinction conditions was measured either after satiating the rats (rats were given 10 g of banana-flavored pellets before the test) or without satiating the rats.
Surgery.
Following food training, rats were administered anesthesia (ketamine, 90 mg/kg, i.p.; xylazine, 10 mg/kg, i.p.) and underwent surgery for both (1) insertion of a catheter into the right jugular vein (as previously described; Crombag et al., 2000) and (2) implantation of bilateral guide cannulae aimed at either the nucleus accumbens (NAc) core [anteroposterior (AP), +1.8 mm; mediolateral (ML), ±1.6 mm; dorsoventral (DV), −5 mm from bregma; Singer et al., 2016] or the dorsolateral striatum (DLS; AP, +1.2 mm; ML, ±1.2 mm; DV, −3 mm from bregma; Vanderschuren et al., 2005). Guide cannulae were secured in place with surgical screws and dental acrylic. Both before surgery and during recovery, rats were administered saline (5 ml, s.c.), the antibiotic cefazolin (100 mg/kg, s.c.), and the analgesic carprofen (5 mg/kg, s.c.). For the remainder of the experiment, intravenous catheters were flushed daily with sterile saline containing 5 mg/ml gentamicin sulfate to minimize infection and prevent occlusions. Rats were allowed to recover from surgery for 7 d before cocaine self-administration training began.
Infusion criteria.
The acquisition of drug self-administration took place over the course of 9 d, with only the taking lever present (Fig. 1, Stage 2). During training, all rats were required to take the same amount of cocaine hydrochloride (National Institute on Drug Abuse), as predetermined by an infusion criteria (IC) procedure (Saunders and Robinson, 2010). Accordingly, rats gradually increased cocaine taking from 10 to 40 infusions/d (IC10, 2 d; IC20, 3 d; IC40, 4 d; maximum 4 h/d). Each session started with a 1 min house light off period, followed by both the house light turning on and extension of the taking lever (the same one used for food training). Rats were allowed to lever-press for cocaine on an FR1 schedule (0.4 mg/kg/infusion in 50 μl delivered over 2.6 s), and cocaine infusions were paired with the presentation of a cue light. The CS remained illuminated for 20 s, during which time subsequent lever presses had no consequence. At the end of each session, after each rat completed the required number of infusions, the house light turned off and the rat was returned to its home cage. Rats that did not complete IC training within 9 d were excluded from the experiment (n = 2).
Behavioral economic tests.
After acquiring cocaine self-administration (n = 34; three replications), baseline (BL) behavioral economic parameters were measured using a within-session threshold procedure, as described previously (Oleson and Roberts, 2009; Oleson et al., 2011; Bentzley et al., 2013; Kawa et al., 2016). Briefly, during five 110 min within-session threshold tests (one per day), rats were allowed to press the taking lever to receive cocaine. However, the dose of cocaine was decreased every 10 min according to a quarter logarithmic scale (383.5, 215.6, 121.3, 68.2, 38.3, 21.6, 12.1, 6.8, 3.8, 2.2, and 1.2 μg/infusion), without any time-out periods. During these tests, the cue light was presented during each drug infusion, while the house light was on for the entire session (except during the first 60 s). As described previously (Bentzley et al., 2013; Kawa et al., 2016), the drug-taking data were used to generate demand curves via a focused-fitting approach (typically using the final 3 d of stable responding on the threshold procedure). Accordingly, for each rat, BL measures were obtained for Pmax (price of drug that elicited maximum responding), Q0 (preferred level of drug consumption when the price was negligible), and α (demand elasticity, normalized to Q0).
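The threshold schedule can be made concrete with a minimal sketch. It assumes that the quarter logarithmic scale means each dose is the previous one divided by 10^(1/4), and that price on an FR1 schedule is expressed as responses per milligram of cocaine; both are conventions of this kind of analysis rather than details stated here. The function names and the example response counts are hypothetical, and the simple "price at peak responding" read-out is not the focused-fitting procedure used to derive the reported Pmax.

```python
# Sketch of the within-session threshold schedule (assumptions noted inline).

START_DOSE_UG = 383.5  # first dose in the series (ug/infusion), from the text
N_BINS = 11            # eleven 10 min bins give the 110 min session


def quarter_log_doses(start=START_DOSE_UG, n=N_BINS):
    """Doses that fall by a quarter log10 unit in each successive bin."""
    return [start * 10 ** (-k / 4) for k in range(n)]


def unit_prices(doses_ug):
    """Assumed price on FR1: lever presses required per mg of cocaine."""
    return [1000.0 / dose for dose in doses_ug]


def empirical_pmax(prices, responses):
    """Price of the bin with the most responses (first such bin on ties)."""
    best = max(range(len(responses)), key=lambda i: responses[i])
    return prices[best]


doses = quarter_log_doses()
prices = unit_prices(doses)

# Hypothetical lever presses per bin for one rat (not data from the study).
example_responses = [3, 5, 8, 14, 22, 35, 41, 30, 12, 4, 1]
pmax = empirical_pmax(prices, example_responses)
```

The computed doses agree with the listed series to within rounding in the first decimal place.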
Following the threshold procedure, rats were tested on a within-session punishment procedure for 3 d. As described previously (Bentzley et al., 2014; Kawa et al., 2016), during this test the dose of drug available for self-administration remained constant (38.3 μg/infusion), but the cost of drug gradually increased by imposing an adverse consequence for taking it (a footshock; 0.5 s). Briefly, after a 20 min period of cocaine administration (FR1) without punishment, the level of shock delivered concurrently with a drug infusion increased every 10 min (0.10, 0.13, 0.16, 0.20, 0.25, 0.32, 0.40, 0.50, 0.63, 0.79 mA). To normalize for individual variation, data were analyzed as the maximum current each rat was willing to endure to defend its preferred level of cocaine intake (Max Charge).
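The punishment schedule can be sketched in the same way. The listed currents are consistent with steps of one-tenth of a log10 unit starting at 0.10 mA; this is an inference from the numbers, not something stated in the text. The function names are hypothetical, and the sketch only maps session time to shock intensity; it does not reproduce how Max Charge was derived from each rat's intake.

```python
# Sketch of the within-session punishment schedule (assumptions noted inline).

BASELINE_MIN = 20  # unpunished FR1 period at the start of the session
STEP_MIN = 10      # shock intensity increases every 10 min
N_LEVELS = 10      # ten shock intensities are listed in the text


def shock_levels(start_ma=0.10, n=N_LEVELS):
    """Shock series assuming tenth-log10 steps, rounded to 0.01 mA."""
    return [round(start_ma * 10 ** (0.1 * k), 2) for k in range(n)]


def shock_at(minute):
    """Shock (mA) paired with an infusion taken at a given session minute."""
    if minute < BASELINE_MIN:
        return 0.0  # no punishment during the baseline period
    index = min(int((minute - BASELINE_MIN) // STEP_MIN), N_LEVELS - 1)
    return shock_levels()[index]
```

For example, under these assumptions an infusion taken 25 min into the session is paired with the lowest current, and one taken in the final 10 min with the highest.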
Finally, after prolonged cocaine self-administration using an intermittent access procedure (IntA; Fig. 1, Stage 4), but before the saline-induced and cocaine-induced reinstatement tests, rats were once again tested on the within-session threshold (2 d) and punishment (2 d) behavioral economic procedures. This was to assess how cocaine demand changed from BL, as a function of PSAP/IntA experience.
Puzzle self-administration procedure with intermittent access to cocaine.
Following initial behavioral economic testing, rats self-administered cocaine for 4 weeks using a puzzle self-administration procedure (PSAP) specifically developed to maintain behavioral flexibility in drug-seeking behavior (Fig. 1, Stage 3; Fig. 2; 5 d/week; maximum, 10 trials or 7 h per session; average, 9.41 ± 0.095 completed trials across all sessions). Similar to standard IntA self-administration protocols (Zimmer et al., 2012; Kawa et al., 2016), rats were allotted 5 min drug-available periods (FR1 on the extended taking lever; house light on), alternating with 25 min drug-unavailable time-out periods (taking lever retracted; house light off). When drug was available, each lever press resulted in a cocaine infusion (0.4 mg/kg/infusion in 50 μl of 0.9% sterile saline, delivered over 2.6 s; no postinfusion time out) along with cue light presentation. However, in contrast to previous studies, rats needed to first complete a drug-seeking task on each trial (i.e., solve a puzzle; Table 1), before gaining access to the taking lever. During the first trial, and following each time-out period, puzzle availability (Puzzle-ON; and thus the initiation of drug seeking) was signaled by the house light turning on. Since the puzzle manipulanda (response wheel, rolled-edge lever, nose port) were always present, some interaction did occur during Puzzle-OFF periods (e.g., time-outs). However, there was significantly more drug-seeking during Puzzle-ON than during Puzzle-OFF periods (comparison of drug-seeking rates; see Results; Fig. 4).
During each self-administration day, rats learned to solve a single unique puzzle to gain access to the taking lever. Across the entire experiment, puzzles were not repeated (except for “representative” Puzzle #15, which was used during microinjection procedures described below). The order of puzzle testing was kept constant for all rats (Table 1). Also, puzzles gradually became more difficult as the experiment progressed, requiring an increasing number of drug-seeking responses (Puzzles/Days 1–3, one response required; Puzzles/Days 4–6, two responses required; Puzzles/Days 7–13, three to five responses required; Puzzles/Days 14–20, five to six responses required). Puzzle difficulty increased gradually because we found in pilot studies that the task was too difficult for the rats to master otherwise. Aside from Puzzles #1–3, which required only a single behavioral response for rats to gain access to the drug-taking lever, the remainder of the puzzles required rats to use two of the three manipulanda (two series of responses). Successful completion of each response series resulted in the presentation of a tone (1 s). For example, Puzzle #15 (Fig. 2) first required rats to press the rolled-edge lever four times in a row (essentially FR4), and this resulted in a tone presentation. This also signaled that responding on the rolled-edge lever was no longer required and that the rat must next respond on a different manipulandum (in this example, the wheel). Then, after two wheel turns, the tone would sound again, followed by extension of the taking lever (beginning drug-available and Puzzle-OFF period, while the house light remained on).
Importantly, however, during the Puzzle-ON period, mistakes on the puzzle resulted in the rat having to restart the puzzle from the beginning. Thus, according to representative Puzzle #15, extra presses on the rolled lever (e.g., five presses instead of four), or nose-poking instead of turning the wheel, would have “re-set” the puzzle from the beginning, again requiring four responses on the rolled lever. Despite the difficult nature of the puzzles, rats did improve drug-seeking performance across trials during a given session (see Results). Even so, to ensure that all rats got equal cocaine exposure across days, failure to solve the puzzle after a given period (Trial 1, 10 min; Trials 2–10, 15 min) resulted in the next drug-seeking response giving access to the taking lever, turning the puzzle off for that trial. Finally, because every rat differed in the amount of time taken to solve the puzzle, the amount of time between each drug-available period also differed (Puzzle-ON time + 25 min time-out), adding an extra degree of drug intermittency when compared with other IntA experiments (Kawa et al., 2016).
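The puzzle contingencies described above can be summarized as a small state machine. This is a minimal sketch rather than the control program used in the experiments: the class, method, and label names are hypothetical, and it assumes that an incorrect response resets the puzzle without itself counting toward the restarted first series.

```python
# Sketch of the PSAP puzzle logic (names and reset details are assumptions).


class Puzzle:
    """A puzzle is an ordered list of (manipulandum, required responses)."""

    def __init__(self, steps):
        self.steps = steps
        self.restart()

    def restart(self):
        """Return to the beginning of the first response series."""
        self.stage = 0
        self.count = 0
        self.solved = False

    def respond(self, manipulandum):
        """Process one response: 'ok', 'tone', 'solved', 'reset', or 'off'."""
        if self.solved:
            return "off"  # Puzzle-OFF: the taking lever is already available
        target, needed = self.steps[self.stage]
        if manipulandum != target:
            self.restart()  # any mistake restarts the whole puzzle
            return "reset"
        self.count += 1
        if self.count < needed:
            return "ok"
        self.stage += 1
        self.count = 0
        if self.stage == len(self.steps):
            self.solved = True
            return "solved"  # final tone, then the taking lever extends
        return "tone"  # series complete; a different manipulandum is next


def time_limit_min(trial):
    """Minutes after which the next seeking response gives access anyway."""
    return 10 if trial == 1 else 15


# Representative Puzzle #15: four rolled-edge lever presses, two wheel turns.
puzzle_15 = Puzzle([("rolled_lever", 4), ("wheel", 2)])
```

In this sketch a fifth consecutive press on the rolled-edge lever returns "reset", matching the example in which an extra press sends the rat back to the start of the puzzle.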
Microinjections.
The ability of DA signaling to regulate drug seeking was assessed after 4 weeks of PSAP/IntA cocaine self-administration experience. Using a within-subject procedure, rats received microinjections of either vehicle or the DA receptor antagonist cis-(Z)-flupenthixol (0, 5, or 15 μg in 0.9% sterile saline; 0.5 μl/side/min, plus 1 min diffusion) into the NAc core (n = 8) or the DLS (n = 7), similar to previous reports (Di Ciano and Everitt, 2004; Vanderschuren et al., 2005; Murray et al., 2014). While rats were not divided according to addiction criteria for this analysis (described below), during prolonged PSAP/IntA self-administration on average all rats increased drug seeking across sessions and there were no differences in drug seeking between rats used in the DLS and NAc groups.
Microinjections were performed once every 3 d (doses counterbalanced, Latin-square design), with additional PSAP/IntA cocaine self-administration (novel puzzles; Table 1) on the 2 d between the intracranial infusions. During microinjection test days, drug seeking was tested on representative Puzzle #15 (starting 5 min after injection), allowing for easy comparison of behavior across doses. Also, on these test days, responding on the taking lever resulted in intravenous saline infusions, rather than cocaine, and PSAP/IntA testing was limited to ∼3 h. Because some rats stopped drug seeking under these experimental conditions (flupenthixol, extinction), behavior was only analyzed for the first trial.
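One way to counterbalance three doses across test days is a cyclic Latin square, sketched below. The specific square and the rule for assigning rats to its rows are assumptions made for illustration; the text does not state which Latin square was used. The function names and rat identifiers are hypothetical.

```python
# Sketch of Latin-square dose counterbalancing (construction is an assumption).

DOSES_UG = [0, 5, 15]  # flupenthixol doses from the text (0 = vehicle)


def cyclic_latin_square(conditions):
    """Each condition appears once in every row and in every column."""
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)]
            for row in range(n)]


def assign_orders(rat_ids, conditions):
    """Cycle rats through the rows so dose orders are used about equally."""
    square = cyclic_latin_square(conditions)
    return {rat: square[i % len(square)] for i, rat in enumerate(rat_ids)}


# Hypothetical identifiers for the eight rats with NAc core cannulae.
orders = assign_orders([f"rat{i}" for i in range(1, 9)], DOSES_UG)
```

With eight rats and three dose orders, the rows cannot be filled equally, so two orders are used three times and one twice in this sketch.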
Cocaine-induced reinstatement.
After completing the series of microinjections, rats were allowed to self-administer cocaine according to the PSAP/IntA schedule for an additional 2 d (novel puzzles). Then, following additional behavioral economic testing (Fig. 1, Stage 4; 2 d threshold; 2 d punishment) and 2 more cocaine PSAP/IntA days (novel puzzles; followed by 2 d rest), rats were tested for cocaine-induced reinstatement of drug pursuit using procedures described previously (Deroche et al., 1999; Kawa et al., 2016). Briefly, tests were conducted over 2 d with the puzzle manipulanda removed. Each day began with the house light initially off (1 min) and then turned on for the remainder of the session. Next, on both sessions, the taking lever was extended, and rats underwent extinction for 90 min. After this period, rats received infusions of either intravenous saline (Day 1; 25, 50, 100, 200 μl) or cocaine (Day 2; 0.2, 0.4, 0.8, 1.6 mg/kg; same volume as corresponding saline injections) in 30 min intervals.
Extinction and cue-induced reinstatement.
Rats underwent an extinction procedure (2 h/session) for 7 d after the cocaine reinstatement test. Consistent with other testing conditions, the house light was turned on 1 min after rats were placed in the operant chambers. During extinction, the drug-seeking manipulanda were removed, and the taking lever was extended throughout the session. Responses on the taking lever were without consequence. Next, the ability of the previously drug-paired cue light to reinstate pursuit of drug was tested, using a conditioned reinforcement procedure. Accordingly, rats were again tested under extinction, but each lever-press was reinforced with brief illumination of the cue light that had been previously paired with cocaine injections, along with activation of the infusion pump (2.6 s; no tubing attached).
Verification of cannula placement
At the conclusion of the experiment, all rats were deeply anesthetized (sodium pentobarbital; 60 mg/kg, i.p.), and their brains were extracted and placed in formalin. Brains were later frozen, sectioned using a cryostat (40 μm), and stained (cresyl violet) to confirm cannula tip placements within either the NAc core or DLS (see Fig. 8b,c). Rats lacking correct bilateral cannula placements were not included in the analyses. During the experiment, catheter patency was tested using Brevital (0.1 ml, i.v.) after Puzzles #20 and #26, as well as before the rats were killed.
Experimental design and statistical analysis
As described elsewhere, male Long–Evans rats (n = 46) were trained on the various behavioral procedures. Microinjection procedures (injection site, dose) were counterbalanced according to principles of Latin-square design. One-way or two-way repeated-measures ANOVAs were used for analyzing all behavioral measures (Bonferroni corrections were used to control for multiple comparisons), except for responding during devaluation and extinction, for which paired or unpaired t tests were used. All statistics were performed using GraphPad Prism.
Individual variation in addiction-like behavior was analyzed by determining whether rats met specific “addiction criteria,” as described previously (Deroche-Gamonet et al., 2004; Kawa et al., 2016), and similar to criteria used to assess human substance use disorder in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5; American Psychiatric Association, 2013). First, we determined which rats displayed the following: (1) the greatest (top one-third) motivation for drug (Pmax), (2) drug taking despite adverse consequences (Max Charge endured), and (3) greatest continued pursuit of drug despite it not being available (during extinction). Rats that met two or three of these benchmarks (2–3 criteria) were classified as positively meeting addiction-like criteria (n = 5), and the behaviors of these rats were compared with rats that met none or one addiction criterion (0–1 criteria; n = 10). This distribution observed in Long–Evans rats was similar to that in other strains, including Sprague Dawley rats (Kawa et al., 2016). Drug seeking described in the current results was not used as a standard for determining whether rats met none or one criterion or met two or three criteria because it was not included in previous reports (Deroche-Gamonet et al., 2004; Kawa et al., 2016). Some rats were not tested beyond the PSAP/IntA procedure or did not complete the entire experiment (i.e., through the cue-induced reinstatement test), and were thus excluded from the analyses of individual variation in motivation.
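The criteria-counting rule can be illustrated with a short sketch. It assumes that "top one-third" means the highest-ranked n/3 rats on each measure (rounded down, with ties broken by input order); how ties and rounding were actually handled is not stated. The function names and all scores below are hypothetical and are not data from the study.

```python
# Sketch of the addiction-criteria classification (tie handling is assumed).


def top_third(scores):
    """Rats ranked in the top one-third on a single measure."""
    n_top = max(1, len(scores) // 3)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n_top])


def count_criteria(pmax, max_charge, extinction):
    """Number of the three benchmarks (0-3) met by each rat."""
    tops = [top_third(measure) for measure in (pmax, max_charge, extinction)]
    return {rat: sum(rat in top for top in tops) for rat in pmax}


def classify(n_criteria):
    """Group label used for the comparison of individual variation."""
    return "2-3 criteria" if n_criteria >= 2 else "0-1 criteria"


# Hypothetical scores for six rats.
pmax = {"r1": 90, "r2": 80, "r3": 30, "r4": 20, "r5": 10, "r6": 5}
max_charge = {"r1": 0.50, "r2": 0.13, "r3": 0.63,
              "r4": 0.10, "r5": 0.16, "r6": 0.20}
extinction = {"r1": 40, "r2": 55, "r3": 12, "r4": 8, "r5": 60, "r6": 20}

counts = count_criteria(pmax, max_charge, extinction)
groups = {rat: classify(n) for rat, n in counts.items()}
```

In this hypothetical example, two of the six rats rank in the top third on two measures and fall in the 2–3 criteria group.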
Importantly, the PSAP/IntA procedure is not meant to be a complete and all-encompassing animal model of addiction. For example, it is well known that, when given the opportunity to obtain an “alternative reinforcer” to drug, animals and people will decrease their drug use (Higgins, 1997; Venniro et al., 2017; for review, see Heather, 2017). This was not modeled in the present manuscript. We also did not incorporate measurements of impulsivity into the PSAP/IntA procedure (Dalley et al., 2011). Furthermore, like previous reports (Deroche-Gamonet et al., 2004; Kawa et al., 2016), we cautiously refer to the rats as displaying various “addiction-like” behaviors. While we and others believe that the behavioral economic and reinstatement techniques used have criterion validity (Epstein et al., 2006; MacKillop, 2016), the rats are not “addicts” and the complexity of human behavior obviously extends well beyond what can be modeled in animals. That said, the lack of preclinical studies that have been translated into acceptable treatments for substance abuse may, in part, be due to incomplete or inadequate modeling of the human condition in animals. While it is difficult to mimic in rats the complex conduct of a “street addict” procuring drugs, to the best of our knowledge PSAP/IntA is the first procedure that attempts to model this behavior in animals.
Results
Acquisition of cocaine self-administration
Rats were first trained to lever-press for food and then to self-administer cocaine (data not shown). Rats readily increased responding for cocaine across training days (IC procedure; F(2,66) = 56.8, p < 0.0001; one-way repeated-measures ANOVA comparing lever pressing across days; p < 0.0001, taking lever responses on IC40 vs IC10 or IC20; p < 0.05, IC20 vs IC10; Bonferroni). Similarly, rats spent more time self-administering drug when given the opportunity to take more cocaine (F(2,66) = 219.1, p < 0.0001; one-way repeated-measures ANOVA comparing session length across days; p < 0.0001, IC40 vs IC10 or IC20, IC20 vs IC10, Bonferroni). Rats that did not administer 40 cocaine infusions on the final day of this procedure were excluded from further testing (n = 2).
Puzzle self-administration procedure with intermittent access to cocaine
Drug seeking
After successfully learning to lever-press for cocaine, rats were allowed to self-administer cocaine for 20 d using the PSAP/IntA procedure (n = 34). PSAP/IntA was designed to preclude the development of habitual drug-seeking across testing days. Accordingly, on each day rats needed to solve a single puzzle, for a total of 10 trials each day. It was possible, however, that rats were not learning to solve these puzzles, but were instead responding randomly on the drug-seeking manipulanda. To assess this possibility, we measured the rats' within-session puzzle performance across days. Regardless of puzzle difficulty, rats improved their puzzle performance between the start and the end of testing each day (Fig. 3a, Puzzles 4–6; F(2,66) = 4.11, p = 0.02; Fig. 3b, Puzzles 7–13; F(2,66) = 20.23, p < 0.0001; Fig. 3c, Puzzles 14–20; F(2,66) = 17.17, p < 0.0001; one-way repeated-measures ANOVAs), making a higher percentage of correct responses late in each session (Trials 4–6 and/or 7–10; p < 0.05–0.0001, Bonferroni) compared with earlier that day (Trials 1–3). Despite this improvement, at the end of each session rats still only made correct responses ∼45% of the time, indicating that the puzzles were quite difficult; rats continued to struggle to solve the puzzles each day, and more often than not they had to restart puzzles within a session. In addition, there was no improvement at the start of each session between days of the procedure. This indicates that the puzzles were sufficiently demanding to preclude the development of stereotyped, routine, or “habitual” behavior, and that responding instead reflected motivated, goal-directed behavior throughout the PSAP/IntA schedule. This is consistent with increases in motivation to solve the puzzles to gain access to drug, with increasing puzzle and drug experience (see below).
Improved puzzle-solving during the PSAP/IntA procedure. a–c, Regardless of puzzle difficulty (a, 2 responses required; b, 3–5 responses required; c, 5–6 responses required), rats improved their performance during daily sessions (n = 34; †p < 0.0001–0.05), making significantly more correct responses on Trials 7–10 compared with Trials 1–3 (p < 0.0001–0.05) or 4–6 (Puzzles 14–20; p < 0.05). Graphs show mean ± SEM.
Interestingly, it is possible that the rats' behavior during PSAP/IntA was governed by a series of semiautomated conscious subgoals ruled by if–then conditions (implementation intentions; Sheeran et al., 2005; Wood and Rünger, 2016). This phenomenon has been referred to as strategic automaticity, and it differs from the unconscious automaticity commonly associated with habits (Gollwitzer and Schaal, 1998). In sum, proficiency is not what is essential; what matters is that responding persists and remains flexible as the rats make mistakes.
We next assessed how drug seeking changed during prolonged PSAP/IntA cocaine self-administration. Because the difficulty of the puzzles increased as the experiment progressed (Table 1), drug seeking was calculated as rate of responding (puzzle manipulanda activations normalized to the total amount of time needed to solve the puzzle; Puzzle-ON) and then compared with rate of responding during time-out periods (25 min; Puzzle-OFF). Across the weeks of self-administration, rats significantly increased their rate of drug-seeking behavior (Fig. 4a, Puzzle-ON black circles, Puzzle-OFF white circles; two-way repeated-measures ANOVA comparing Puzzle-ON vs Puzzle-OFF responding across all trials; Effect of Session, F(3,99) = 3.92, p = 0.01; Effect of Puzzle-ON/OFF, F(1,33) = 35.06, p < 0.0001; Interaction between Session and Puzzle-ON/OFF, F(3,99) = 3.36, p = 0.02; Puzzle-ON Days 14–20 vs Days 1–3 or 4–6, p < 0.0001–0.01, Bonferroni). Drug seeking was always greater during Puzzle-ON periods relative to Puzzle-OFF time-outs (Fig. 4a; p < 0.0001–0.05, Bonferroni).
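The normalization described above can be illustrated with a minimal sketch. It assumes only that a rate is the number of manipulanda activations divided by the minutes spent in a period; the function name and the example counts are hypothetical and are not taken from the data.

```python
def response_rate(responses: int, minutes: float) -> float:
    """Seeking responses per minute; returns 0.0 for an empty period."""
    if minutes <= 0:
        return 0.0
    return responses / minutes


# Hypothetical trial: 48 activations while solving a puzzle for 6 min
# (Puzzle-ON), then 10 activations during the 25 min time-out (Puzzle-OFF).
puzzle_on_rate = response_rate(48, 6.0)
puzzle_off_rate = response_rate(10, 25.0)
print(puzzle_on_rate, puzzle_off_rate)
```

Expressing both periods as rates makes them comparable even though solving time varies with puzzle difficulty while the time-out is fixed.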
Drug-seeking behavior during PSAP/IntA. a, To determine changes in drug-seeking behavior with increasing PSAP/IntA experience (Session), while accounting for the increased number of puzzle responses required, behavior was analyzed as a rate (seeking responses per minute). a shows that the rate of drug seeking increased across 4 weeks of cocaine self-administration (Puzzle-ON, black circles; †p < 0.0001–0.01, seeking Days 14–20 vs 1–3 or 4–6). The rate of drug seeking was significantly greater during Puzzle-ON periods, compared with Puzzle-OFF time-outs (*p < 0.0001; white vs black circles; p < 0.0001–0.05, comparing each day). In a subset of rats (n = 6), drug seeking decreased when the tones that guided seeking behavior were omitted (No Tone, cross-hatched square; *p < 0.05 vs same rats during Puzzle-ON for Sessions 14–20). b, Mistakes made while drug seeking on each puzzle trial forced the rats to restart the puzzle from the beginning. Puzzles became harder to solve across sessions and, accordingly, the number of times the rats restarted each puzzle also increased (†p < 0.0001). n = 34. Graphs show mean ± SEM.
When rats made mistakes while trying to solve a puzzle, they were forced to restart the puzzle from the beginning (i.e., they had to again perform the first required behavioral response series; Fig. 2). Puzzles became harder to solve across sessions (Table 1) and rats had difficulty solving later puzzles. Accordingly, the number of occasions on which rats were forced to restart the puzzles increased across sessions (Fig. 4b; F(3,99) = 54.1, p < 0.0001). Importantly, despite this increase in failure rate, rats increased the rate at which they tried to solve the puzzles (Fig. 4a), and they gradually got better at solving the puzzle during each session (Fig. 3). The rats' perseverance in drug seeking, and increased rate of responding, as the puzzles became progressively more difficult, may reflect increasing motivation to procure drug, which is consistent with data from the behavioral economic measures of cocaine demand (see below). Furthermore, given that they were required to constantly adjust their behavior, it would be expected that drug seeking would never become habitual, which is supported by further analyses below.
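The restart rule can be sketched as a simple sequence check. This is an illustration under stated assumptions, not the control program used in the study: the manipulanda names and the example puzzle are hypothetical, and every incorrect response is counted as one restart.

```python
def run_puzzle(required, responses):
    """Step through responses; any mistake resets progress to the start.

    Returns (solved, restarts). Counting each incorrect response as one
    restart is a simplifying assumption made for this sketch.
    """
    position = 0
    restarts = 0
    for response in responses:
        if response == required[position]:
            position += 1
            if position == len(required):
                return True, restarts
        else:
            position = 0
            restarts += 1
    return False, restarts


# Hypothetical puzzle: two wheel turns followed by one nose poke.
required = ["wheel", "wheel", "poke"]
attempt = ["wheel", "poke", "wheel", "wheel", "poke"]
print(run_puzzle(required, attempt))
```

In this example the early nose poke forces one restart before the chain is completed.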
On a single test day, after 20 d of PSAP/IntA experience, the tones that normally signaled successful completion of each response chain were omitted in a subset of rats. Note that these tones were neither paired with drug administration (they were not a drug CS) nor acted as a discriminative stimulus signaling drug availability. Indeed, >50% of the time a tone did not precede extension of the drug-taking lever, because more often than not the rats made a mistake after completing the first response chain, and had to restart the puzzle. Thus, the tones should not be interpreted as influencing behavior through properties of conditioned reinforcement, but instead they are “guide tones” aiding in the performance of drug-seeking behavior. In contrast to the tones, the drug CS was the light cue paired with cocaine injections (and which was used in the test of reinstatement), and extension of the drug-taking lever was the discriminative cue that signaled drug availability. That said, omission of the guide tone significantly decreased the rate of drug seeking to the level seen during Puzzle-OFF periods (Fig. 4a, cross-hatched square, subset of rats; t(5) = 2.61, p = 0.048; paired t test, Days 14–20 vs no tone responding). This indicates that these tones, which guided puzzle performance but were not paired with drug delivery, nevertheless powerfully motivated drug-seeking behavior. The nature of the psychological processes that allowed the tones to guide and motivate behavior deserves further investigation. Finally, because drug seeking ceased in the absence of the tones, rats did not gain access to the taking lever during this specific test session, and thus drug self-administration was not measured.
Last, in the drug-naive subset of rats trained to seek and take sucrose pellets using a similar PSAP schedule (∼20 d), devaluation of the reinforcer via satiation significantly decreased the pursuit of sucrose (reward-seeking puzzle responses, t(11) = 3.04, p = 0.017; food receptacle entries, t(11) = 2.36, p = 0.038; data not shown).
In summary, the PSAP/IntA procedure resulted in the following five findings: (1) motivation to solve the puzzles increased, as indicated by an increase in rate of responding and response perseverance during the Puzzle-ON periods, even as puzzle difficulty increased (Fig. 4); (2) correct behavioral responding never improved beyond 35–45%, and thus responding could never become automatized, as more often than not the rats had to restart the puzzle; (3) rats could withhold responding when the puzzle was off and the guide tones were absent (compare seeking when the puzzle was on vs off; Fig. 4a); (4) the tones may have had motivational value that promoted continued drug seeking, because their omission decreased seeking behavior to levels seen during Puzzle-OFF conditions (Fig. 4a); and (5) the use of the PSAP with a sucrose reward prevented the development of S-R habits, as responding remained sensitive to devaluation of the reward. These data support the claim that drug seeking never became “automatized” or habitual under PSAP/IntA conditions, and that seeking behavior remained sensitive to its consequences.
Drug taking
During the PSAP/IntA schedule, after rats correctly solved the puzzle on a given trial, they then gained access to the cocaine-taking lever for 5 min on an FR1 schedule, before a 25 min time-out period ensued. As shown in Figure 5a, on each trial, most cocaine infusions were taken during the first minute of the 5 min period that rats had access to the drug, and escalation of cocaine use occurred during this first minute of drug availability across weeks of self-administration (effect of Sessions 1–3 vs 14–20, F(1,33) = 35.46, p < 0.0001; effect of trial, F(2,66) = 6.39, p = 0.0029; session × trial interaction, F(2,66) = 8.25, p = 0.0006; p < 0.0001, any trial during Days 1–3 vs any trial for Days 14–20; Bonferroni). Furthermore, during early PSAP/IntA sessions (Days 1–3), rats also increased their intake of cocaine across trials (during a session), taking more cocaine during Trials 7–10 compared with either Trials 1–3 or 4–6 (p < 0.0001–0.01; comparing first minute of drug availability per trial; Bonferroni).
Drug-taking behavior during PSAP/IntA. a, Number of cocaine infusions during each min of the 5 min drug-available period within daily sessions (Daily Trials 1–3, 4–6, and 7–10, horizontal axis) as a function of days of PSAP/IntA experience (open circles, the first 1–3 d of PSAP/IntA experience and closed circles after 14–20 d of PSAP/IntA experience). Although cocaine was available for a total of 5 min (FR1 schedule) after each puzzle completion on each trial, most of the infusions were self-administered during the first minute of drug access (compare minute 1, 2, 3, 4, and 5 during each of the trial blocks). During the first minute of cocaine access, there was a significant increase in infusions administered both across sessions (Days 1–3 vs 14–20; †p < 0.0001) and across trials for a given session (*p < 0.05). There was also a significant effect of trial number for Sessions 1–3; animals took more cocaine in the first minute of availability on Trials 4–6 and 7–10, relative to Trials 1–3 (p < 0.001–0.01). Rats also escalated cocaine intake for minutes 2–4 of drug availability during Sessions 14–20, relative to Sessions 1–3 (p < 0.05). b, Average cocaine intake on the first daily trial across four PSAP/IntA blocks. Rats escalated their cocaine intake across the 4 weeks of PSAP/IntA (†p < 0.0001). n = 34. Graphs show mean ± SEM.
We did not directly assess whether drug-taking behavior became habitual. However, even after escalation of intake, most drug-taking behavior consisted of taking 4–5 infusions in the first minute of drug availability and then stopping (presumably because brain levels of the drug rapidly reached QO; see below). It is hard to imagine that these 4–5 actions during each drug-available period would transition from control by A-O associations to S-R associations, because the latter typically requires overtraining. Furthermore, if drug taking was completely habitual, then we might have expected rats to continuously self-administer cocaine throughout the 5 min drug-available period. Under this scenario, rats would have continued responding on the taking lever even if they did not “desire” or “want” drug, similar to how overtraining rats to self-administer cocaine results in consistent drug-taking responses even if cocaine has been devalued (Miles et al., 2003). This, however, was not the case; rats took most of their cocaine infusions during the first minute of drug availability. This restricted-pattern drug administration suggests that drug taking, similar to drug seeking, was not habitual. Nevertheless, we never attempted to devalue cocaine or otherwise test whether drug taking came to be controlled by S-R associations, so we cannot address that issue here. That being said, rats did continue to show escalated cocaine intake beyond the first minute of drug availability during late PSAP/IntA sessions (Days 14–20; effect of session across trials: second minute, F(1,33) = 6.23, p = 0.02; third minute, F(1,33) = 5.78, p = 0.02; fourth minute, F(1,33) = 4.68, p = 0.04; fifth minute, F(1,33) = 3.96, p = 0.05).
Rats also escalated their total daily cocaine intake across the weeks of PSAP/IntA self-administration (F(3,99) = 4.94, p = 0.0031, one-way repeated-measures ANOVA; data not shown), responding more on the taking lever during later sessions (Days 14–20) compared with earlier sessions (Days 1–3 or 4–6; p < 0.01–0.05, Bonferroni). This escalation of cocaine taking was particularly evident during the first daily trial (Fig. 5b; F(3,99) = 11.44, p < 0.0001, one-way repeated-measures ANOVA of infusions; p < 0.0001–0.05, Days 14–20 vs 1–3 or 4–6; p < 0.01, Days 7–13 vs 1–3; Bonferroni). The sensitization of these responses, both within and across sessions, suggests that with prolonged PSAP/IntA experience, the rats developed one feature of addiction-like behavior, escalation of intake, consistent with previous reports (Kawa et al., 2016; Allain et al., 2017; Pitchers et al., 2017).
Tests for addiction-like behavior
A major goal of this study was to develop an animal model of substance-abuse disorder that better reflects the flexible drug-seeking behavior that typically characterizes the behavior of drug users as they transition to addiction. When modeling addiction-like behavior in animals, it is important to consider that not everyone who experiments with drugs goes on to compulsively abuse drugs. Furthermore, the DSM-5 attempts to quantify the severity of substance-use disorders by determining the number of symptoms individuals suffer from. To model this individual variation in animals, we first identified rats meeting either the most (2–3 criteria rats; n = 5) or fewest (0–1 criteria rats; n = 10) criteria of addiction, as previously described by Deroche-Gamonet et al. (2004) and in our recent paper using the IntA procedure (Kawa et al., 2016; see Experimental design and statistical analysis). Of course, animals in the top third on a measure used as an addiction “criterion” will score high on that measure after PSAP/IntA. The relevant question for this analysis concerns the extent to which motivation for cocaine changed in 0–1 criteria rats versus 2–3 criteria rats. That is, did these subgroups always differ on measures of cocaine demand, or were they similar before PSAP/IntA experience and come to differ only as a result of that experience (i.e., did the experience change them differently)? The results indicate the latter.
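The grouping logic can be sketched as follows. This is an illustration under stated assumptions rather than the exact scoring used here: it assumes a rat meets a criterion when it falls in the top third of the cohort on that measure, and the measure names, the rounding rule for "top third", and all scores are hypothetical.

```python
def count_criteria(scores):
    """Count, for each rat, the measures on which it ranks in the top third.

    scores maps rat id -> {measure name: value}. Taking the top
    len(rats) // 3 animals per measure is an assumption of this sketch.
    """
    rats = sorted(scores)
    n_top = len(rats) // 3
    met = {rat: 0 for rat in rats}
    for measure in scores[rats[0]]:
        ranked = sorted(rats, key=lambda rat: scores[rat][measure], reverse=True)
        for rat in ranked[:n_top]:
            met[rat] += 1
    return met


# Hypothetical cohort of six rats scored on three hypothetical measures.
cohort = {
    "r1": {"pmax": 10, "max_charge": 5.0, "extinction": 50},
    "r2": {"pmax": 9, "max_charge": 4.0, "extinction": 40},
    "r3": {"pmax": 3, "max_charge": 1.0, "extinction": 10},
    "r4": {"pmax": 2, "max_charge": 2.0, "extinction": 45},
    "r5": {"pmax": 1, "max_charge": 0.5, "extinction": 5},
    "r6": {"pmax": 4, "max_charge": 3.0, "extinction": 8},
}
criteria = count_criteria(cohort)
high_group = [rat for rat, n in criteria.items() if n >= 2]  # 2-3 criteria rats
low_group = [rat for rat, n in criteria.items() if n <= 1]   # 0-1 criteria rats
print(criteria, high_group, low_group)
```

Because group membership is defined by post-experience scores, the informative comparison is how the two groups differed before versus after PSAP/IntA, as the text notes.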
Individual variation in seeking and taking cocaine
During the initial acquisition of cocaine self-administration (IC procedure), there were no differences between 0–1 and 2–3 criteria rats in lever-presses made (Fig. 6a; effect of group, F(1,13) = 0.061, p = 0.81; effect of IC, F(2,26) = 50.92, p < 0.0001; group × IC interaction, F(2,26) = 0.36, p = 0.70), and in fact the 2–3 criteria rats were on average slower to obtain 20 or 40 infusions (Fig. 6b; effect of group, F(1,13) = 17.78, p = 0.001; effect of IC, F(2,26) = 122.00, p < 0.0001; group × IC interaction, F(2,26) = 3.81, p = 0.035; Bonferroni post hoc tests, 0–1 vs 2–3 criteria rats for IC20 or IC40, p < 0.001–0.01). Next, we reanalyzed the PSAP/IntA self-administration data as a function of addiction criteria. The 0–1 and 2–3 criteria rats did not differ in their rate of drug-seeking behavior before IntA experience (responses/min while solving puzzles), but with prolonged PSAP/IntA experience the rate of drug seeking significantly increased in 2–3 criteria rats, but not in 0–1 criteria rats (Fig. 6c, Puzzle-ON; effect of session, F(1,13) = 15.22, p = 0.0018; effect of group, F(1,13) = 1.09, p = 0.32; session × group interaction, F(1,13) = 10.43, p = 0.0066; PSAP/IntA Days 1–3 vs 14–20, p < 0.01 for 2–3 criteria rats; 0–1 vs 2–3 criteria rats, p < 0.05 during PSAP/IntA Days 14–20). In contrast, there were no differences between 0–1 and 2–3 criteria rats in drug seeking during the 25 min time-out periods, suggesting that all rats readily discriminated between drug-available and drug-unavailable periods (data not shown; PSAP/IntA Days 1–3 vs 14–20; effect of group, F(1,13) = 0.24, p = 0.63; effect of session, F(1,13) = 2.45, p = 0.14; group × session interaction, F(1,13) = 1.97, p = 0.18).
Individual variation in drug self-administration during PSAP/IntA. Rats were divided into two groups, either meeting 0–1 (n = 10) or 2–3 (n = 5) addiction criteria, as defined in the Materials and Methods. a, b, During the acquisition of self-administration using the IC procedure, all rats increased responding for cocaine (a, †p < 0.0001). However, 2–3 criteria rats were slower at completing either 20 or 40 drug infusions (b, *p < 0.01, effect of group; †p < 0.0001, effect of IC; p < 0.001–0.01, 0–1 vs 2–3 criteria rats for either IC20 or IC40). c, Rate of drug seeking during PSAP as a function of addiction criteria. The 0–1 and 2–3 criteria groups did not differ in the rate of drug seeking before PSAP/IntA experience (Sessions 1–3). However, after PSAP/IntA experience (Sessions 14–20) rats meeting 2–3 addiction criteria showed a significant increase in drug seeking, while rats meeting 0–1 criteria did not (†p < 0.01, Days 1–3 vs 14–20 PSAP/IntA for 2–3 criteria rats; *p < 0.05, 0–1 vs 2–3 criteria rats during PSAP/IntA Days 14–20; Bonferroni), d, Rats meeting 2–3 addiction criteria escalated drug intake (†p < 0.01, PSAP/IntA Days 1–3 vs 14–20 for 2–3 criteria rats), whereas rats meeting 0–1 criteria did not significantly escalate cocaine intake. Graphs show mean ± SEM.
Regarding the number of cocaine infusions taken across days of the PSAP/IntA procedure, there was a significant effect of early versus late sessions (Fig. 6d; effect of session, F(1,13) = 17.89, p < 0.0010; effect of group, F(1,13) = 0.081, p = 0.78; session × group interaction, F(1,13) = 2.53, p = 0.14). The 2–3 and 0–1 criteria rats did not differ in drug intake early, but by the end of PSAP/IntA, the 2–3 criteria rats significantly escalated their cocaine intake (p < 0.01, Sessions 1–3 vs 14–20), while 0–1 criteria rats did not, although total intake did not differ significantly. Therefore, during late PSAP/IntA sessions (Days 14–20), all rats took approximately the same amount of cocaine. It seems that while rats differed in motivation to seek cocaine, they did not in the end differ in the amount of drug they preferred to take when it was available. Supporting this idea, regardless of the addiction-criteria group, PSAP/IntA experience did not significantly change the rats' preferred level of drug consumption when the price was negligible (QO; Fig. 7c; effect of BL vs post-PSAP/IntA tests, F(1,13) = 1.74, p = 0.21; effect of group, F(1,13) = 0.39, p = 0.54; BL/post-test × group interaction, F(1,13) = 0.00024, p = 0.99; calculations derived from the behavioral economic “threshold” procedure). Together, these results suggest that while individual variation exists in motivation to seek cocaine after PSAP/IntA experience, the preferred brain concentration of cocaine, which is what is defended when cost increases and is measured by QO, did not differ between the groups and did not change with increasing drug experience. There appears to be a dissociation, therefore, between whatever desired drug effects determine QO and the degree to which rats are motivated to obtain such effects, as we have reported previously (Kawa et al., 2016).
Individual variation in motivation for drug. This figure summarizes changes in measures of cocaine demand and other addiction-like behaviors, as a function of PSAP/IntA experience [BL vs after PSAP/IntA experience (Post)], and as a function of addiction criteria met (0–1 vs 2–3 criteria). a, Pmax is defined as the maximum amount rats were willing to pay (in effort) to maintain their preferred level of drug consumption. Pmax was increased in both 0–1 and 2–3 addiction criteria rats, but the magnitude of the increase was greater in the 2–3 criteria rats (†p < 0.001, BL vs post-PSAP/IntA test for 2–3 criteria rats; *p < 0.001, 0–1 vs 2–3 criteria rats during post-PSAP/IntA test). b, Elasticity of the demand curve (α) refers to how readily responding declines as cost (in effort) increases, and is normalized to the preferred level of consumption (QO) for each rat. Following PSAP/IntA experience, all rats showed a decrease in α (that is, the demand curve became less elastic), indicating insensitivity to changes in drug price (†p < 0.01), and there were no group differences. c, There were no changes in the preferred level of cocaine consumption when cost was negligible (QO). d, Following PSAP/IntA, the 2–3 criteria rats, compared to 0–1 criteria rats, were more willing to endure an electric shock to maintain their preferred level of cocaine consumption, although these groups did not differ before PSAP/IntA experience (*p < 0.05, 0–1 vs 2–3 criteria rats during post-PSAP/IntA test). e, Compared with rats meeting 0–1 addiction criteria, rats meeting 2–3 criteria were more likely to continue responding on the taking lever during a single 90 min extinction session (*p < 0.05). f, During a test for cocaine-induced reinstatement, rats received one noncontingent infusion of cocaine (0/Ext, 0.2, 0.4, 0.8, 1.6 mg/kg) every 30 min. These infusions significantly increased responding on the taking lever (which had no consequence), regardless of addiction-criteria group (†p < 0.01). 
g, After the test for cocaine-induced reinstatement, rats underwent seven daily 2 h extinction sessions. The 2–3 criteria rats responded more on the lever than the 0–1 criteria rats during extinction (*p < 0.0001) and there was also a significant effect of session (†p < 0.05; 2–3 criteria rats were different from 0–1 criteria rats on Ext1–Ext2, but not Ext3–Ext7, Bonferroni). h, Next, on the test for cue-induced reinstatement (2 h), lever presses resulted in cue-light presentation and concurrent activation of the infusion pump (not connected to rat) for 2 s. While all rats displayed cue-induced reinstatement, this effect was greatest in rats meeting 2–3 addiction criteria (†p < 0.001–0.05, Ext7 vs CR for either 0–1 or 2–3 criteria rats; *p < 0.001, 2–3 vs 0–1 criteria rats for CR test). Rat criteria: 0–1 (n = 10) or 2–3 (n = 5). Graphs show mean ± SEM.
Behavioral economic assessment of changes in cocaine demand as a function of PSAP/IntA experience
Cocaine demand was assessed both before (BL) and after (post-test) prolonged PSAP/IntA self-administration experience. During the “threshold” test, the cost of cocaine was progressively increased by increasing the number of lever presses required to maintain the preferred brain level of cocaine. One measure of motivation for cocaine is the point at which the “cost of drug” was so high that rats were unwilling to continue “paying” (responding; Pmax; Fig. 7a). Before PSAP/IntA the 0–1 and 2–3 criteria groups did not differ in Pmax, and PSAP/IntA resulted in a significant increase (sensitization) in Pmax in both groups, but the increase in Pmax was significantly greater in 2–3 criteria than in 0–1 criteria rats (effect of BL vs post-PSAP/IntA tests, F(1,13) = 27.57, p = 0.0002; effect of group, F(1,13) = 7.63, p = 0.016; BL/post-test × group interaction, F(1,13) = 9.62, p = 0.0084; p < 0.001, Bonferroni). Also, after weeks of the PSAP/IntA procedure, the demand curves became more inelastic in all rats, and the two groups did not differ on this measure (Fig. 7b; α; effect of BL vs post-PSAP/IntA test, F(1,13) = 10.50, p = 0.0064; effect of group, F(1,13) = 0.79, p = 0.39; BL/post-test × group interaction, F(1,13) = 0.00069, p = 0.98). Together, these findings suggest that prolonged cocaine self-administration using the PSAP/IntA procedure resulted in sensitized motivation for cocaine (increased Pmax and decreased α), but no change in the preferred brain concentration of cocaine (QO).
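The relationship among these demand parameters can be sketched with a worked example. It assumes the exponential demand equation commonly attributed to Hursh and Silberberg (2008), log10(Q) = log10(Q0) + k(exp(−α·Q0·C) − 1), which this section does not state explicitly; here Q0 corresponds to QO in the text, C is price, and k is a range constant. The parameter values are hypothetical, and a Pmax estimated empirically from the data need not equal the model-derived value computed below.

```python
import math


def consumption(price, q0, alpha, k=2.0):
    """Consumption at a given price under the assumed exponential demand model."""
    return q0 * 10.0 ** (k * (math.exp(-alpha * q0 * price) - 1.0))


def model_pmax(q0, alpha, k=2.0):
    """Price at which expenditure (price * consumption) peaks in this model.

    Setting d(log expenditure)/d(price) = 0 gives x * exp(-x) = 1 / (k * ln 10),
    where x = alpha * q0 * price. The relevant root lies in [0, 1], where
    x * exp(-x) is increasing, so plain bisection is sufficient.
    """
    target = 1.0 / (k * math.log(10.0))
    if target > math.exp(-1.0):
        raise ValueError("k is too small for expenditure to have an interior peak")
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if mid * math.exp(-mid) < target:
            lo = mid
        else:
            hi = mid
    x = (lo + hi) / 2.0
    return x / (alpha * q0)


# Hypothetical values: alpha halves after experience while q0 is unchanged.
pmax_before = model_pmax(q0=1.0, alpha=0.002)
pmax_after = model_pmax(q0=1.0, alpha=0.001)
print(pmax_before, pmax_after)
```

In this model a smaller α (less elastic demand) yields a proportionally larger Pmax when QO is held constant, which is the direction of change reported above.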
People with a substance-use disorder often continue taking drug in the face of enduring negative consequences. To model this, we asked whether rats would continue self-administering cocaine despite receiving increasing amounts of foot shock. There was no difference in the Max Charge 0–1 and 2–3 criteria rats were willing to endure to take cocaine before PSAP/IntA experience. However, with prolonged cocaine experience, Max Charge significantly increased in the 2–3 (but not 0–1) criteria rats (Fig. 7d; BL/post-test × group interaction, F(1,13) = 7.35, p = 0.018; effect of BL vs post-PSAP/IntA test, F(1,13) = 0.29, p = 0.60; effect of group, F(1,13) = 1.50, p = 0.24; p < 0.05, 0–1 vs 2–3 criteria rats during post-PSAP/IntA test, Bonferroni). Similar findings have been reported previously (Deroche-Gamonet et al., 2004), where only a small proportion of animals developed compulsive drug use despite negative consequences.
Individual variation in cocaine-induced and cue-induced reinstatement
Even for people who are addicted, but have been able to stop, re-exposure to either their drug of choice or to drug-associated cues can instigate relapse into drug abuse (Anggadiredja et al., 2004). This long-lasting aspect of addiction can be modeled in rats by measuring how a cocaine-priming injection or exposure to a previously drug-paired CS can reinstate the pursuit of drug. In the present study, the reinstatement of drug pursuit was measured after prolonged PSAP/IntA cocaine self-administration (Fig. 1, timeline). First, during a single extinction session, rats meeting 2–3 addiction criteria, compared with the 0–1 criteria rats, responded more on the lever that was previously used to take drug (Fig. 7e; t(13) = 2.72, p = 0.018). The next day, noncontingent intravenous cocaine infusions were administered and these dose-dependently increased responding on the taking lever, regardless of whether rats met “criteria for addiction” (Fig. 7f; effect of drug dose, F(4,52) = 4.01, p = 0.0065; effect of group, F(1,13) = 2.07, p = 0.17; dose × group interaction, F(4,52) = 0.29, p = 0.88). Thus, after being re-exposed to drug, all rats were liable to “relapse” into drug pursuit, regardless of the number of “addiction criteria” they met.
After the drug-reinstatement test, rats underwent seven daily extinction sessions followed by a test for cue-induced reinstatement (CR). Similar to above, on the first (Ext1) and second (Ext2) days of extinction, the 2–3 criteria rats responded more on the lever that was previously used to take drug (Fig. 7g; effect of group, F(1,13) = 32.75, p < 0.0001; effect of session, F(6,78) = 2.53, p = 0.028; group × session interaction, F(6,78) = 1.80, p = 0.11; 0–1 vs 2–3 criteria rats for Ext1 or Ext2, p < 0.001, Bonferroni), but this group difference was no longer evident after 7 d of extinction (Ext7). Drug seeking was not assessed following extinction and is thus worthy of future investigation.
Next, the cocaine-associated light CS reinstated responding on the taking lever (under extinction conditions) significantly in both groups (Fig. 7h; effect of group, F(1,13) = 14.29, p = 0.0023; effect of Ext7 vs CR session, F(1,13) = 36.44, p < 0.0001; p < 0.001–0.05, Ext7 vs CR for either 0–1 or 2–3 criteria rats, Bonferroni), but this effect was more robust in 2–3 criteria rats relative to rats meeting 0–1 addiction criteria, as indicated by a significant interaction effect (group × Ext7/CR session interaction, F(1,13) = 8.72, p = 0.011; p < 0.001, 0–1 vs 2–3 criteria rats on CR test, Bonferroni). This effect was evident both during the first and second hours of the test (effect of group, F(1,13) = 11.90, p = 0.0043; effect of time, F(1,13) = 0.76, p = 0.40; group × time interaction, F(1,13) = 0.085, p = 0.78; p < 0.01–0.05, 2–3 vs 0–1 criteria rats at either time point, Bonferroni). Thus, following PSAP/IntA experience, re-exposure to cocaine reinstated similar pursuit of drug in all rats, whereas re-exposure to drug-related CSs reinstated greater pursuit of drug in rats characterized as being most “addiction prone.” The different propensities across rats for drug-induced and cue-induced reinstatement suggest a dissociation between their neurobehavioral underpinnings (Epstein et al., 2006). Accordingly, some psychopharmacologic therapies may be better suited to preventing cue-induced relapse than drug-induced relapse (Anggadiredja et al., 2004).
Drug seeking and DA neurotransmission
DA neurotransmission within the ventral striatum (NAc core) is believed to mediate motivated goal-directed drug seeking (i.e., not habitual), while DA signaling within the DLS is thought to underlie habitual drug seeking (i.e., not goal-directed; Everitt, 2014). Given that the PSAP/IntA procedure models prolonged nonhabitual drug-seeking behavior, we predicted that blocking DA signaling in the NAc core, but not in the DLS, would decrease drug-seeking behavior. To test this, after weeks of PSAP/IntA self-administration, we measured drug seeking after microinjecting the DA receptor antagonist flupenthixol (0, 5, or 15 μg) into either the NAc core or DLS. The effect of flupenthixol on drug seeking was dependent upon which dose was injected into what brain region (Fig. 8a; brain region × drug dose interaction, F(2,26) = 8.30, p = 0.0016; brain region, F(1,13) = 3.99, p = 0.067; effect of drug dose, F(2,26) = 2.47, p = 0.10; two-way repeated-measures ANOVA; individual variation not measured due to sample size). When injected into the NAc core, both doses of flupenthixol reduced drug seeking relative to vehicle (p < 0.05, Bonferroni). In contrast, when injected into the DLS, the lower dose of flupenthixol enhanced drug seeking (5 μg; p < 0.05, vs DLS vehicle or 15 μg; p < 0.01, vs NAc 5 μg), but the higher dose of flupenthixol (15 μg) had no effect.
DA and drug seeking after PSAP/IntA experience. The role of DA transmission in the DLS and NAc core was assessed after 4 weeks of drug self-administration using PSAP/IntA. Across three testing sessions, each rat was administered randomized bilateral microinjections (0.5 μl/side; DLS or NAc core) of saline (vehicle) or of 5 μg or 15 μg of the DA receptor antagonist flupenthixol. Following infusion (1 min) and diffusion (1 min) of vehicle or drug, rats were returned to their home cage for 5 min, before being tested in their respective operant chambers. On these sessions, drug seeking was observed on a representative puzzle (#15). The total number of seeking responses was analyzed during the first puzzle-solving trial, before gaining access to the taking lever. a, There was a significant interaction between the dose of flupenthixol and the brain injection site (p < 0.01). Compared with vehicle, blockade of DA signaling in the NAc core reduced drug seeking at both doses of flupenthixol (*p < 0.05). In contrast, 5 μg of flupenthixol injected into the DLS enhanced drug seeking compared with either vehicle injections or 15 μg drug injections into the DLS (*p < 0.05), as well as compared with 5 μg of flupenthixol infused into the NAc core (*p < 0.05). Histological markings for microinjection sites into the NAc core (b) or DLS (c) are shown according to the Paxinos and Watson (2004) brain atlas. NAc core, n = 8; DLS, n = 7. Graphs show mean ± SEM.
The surprising finding that the low dose of flupenthixol into the DLS actually increased drug seeking may be consistent with the idea that the ventral and dorsal striatum interact to regulate drug seeking. Perhaps the DLS serves as a “brake” on aberrant ventral striatal activity and motivational processes. In fact, it has recently been proposed that suppression of the ventral striatum by the DLS may help limit reward seeking to specific contexts in which reward is likely to be available (via processes of conditioned inhibition, although the exact mechanism remains unclear; Schneck and Vezina, 2012). Thus, it could be hypothesized that blockade of DA signaling in the DLS disinhibited drug seeking (as seen following 5 μg flupenthixol) both in the normal cocaine self-administration environment and in locations where the rat had never before experienced drug. Accordingly, this could result in decreased efficiency in seeking and procuring drug (Willuhn et al., 2012).
Together, these findings suggest that, even after prolonged cocaine self-administration under PSAP/IntA conditions, DA in the NAc core retains control over drug-seeking behavior. Furthermore, the surprising observation of enhanced drug-seeking following DA blockade in the DLS may suggest a novel role for this brain region in the regulation of motivated behavior.
Discussion
Each day addicts typically face unique and constantly changing circumstances, and procuring drugs often requires considerable ingenuity and problem solving, conditions not conducive to the development of habits (Gillan et al., 2015; Halbout et al., 2016; Heather, 2017). As put by Tiffany (1990): “A street addict who daily must find a new way of obtaining heroin would never be able to fully automatize those components of his or her drug-use behavior.” Indeed, such individuals have been described as “economic entrepreneurs” (Preble et al., 1969) who must constantly be “taking care of business” (Neale, 2002; Heather, 2017). To model such flexible patterns of drug seeking in rats, a cocaine self-administration procedure (PSAP) was developed that required rats to solve a new problem (puzzle) each day to gain access to cocaine, which was then taken on an IntA schedule (Zimmer et al., 2012; Kawa et al., 2016). This procedure precluded S-R seeking habits, but nevertheless produced addiction-like behavior, especially in susceptible rats. Furthermore, cocaine seeking was reduced by DA antagonism in the NAc core, but not the DLS. We conclude that neither S-R habits nor a transfer of behavioral control from the ventral to the dorsal striatum is necessary for the development of addiction-like behavior in rats.
Puzzle self-administration procedure
What is the evidence that drug-seeking behavior during PSAP/IntA was not controlled by S-R habits? While performing and sharing this research, we have heard the comment that maybe the rats “get into the habit” of solving puzzles. This comment underscores the importance of differentiating the colloquial use of the word “habit” from its scientific definition. In psychology, habits refer to stereotyped, automatic, and inflexible behaviors that through overtraining come to be evoked by specific stimuli (S-R), largely independent of the value of the goal (Dickinson, 1985; Dickinson et al., 1995; Graybiel, 2008; Everitt, 2014; Gasbarri et al., 2014; Wood and Rünger, 2016). That does not characterize cocaine-seeking behavior in the present study. For example, seeking behavior decreased dramatically when the tone that signaled completion of each response component of the daily puzzle was omitted, indicating that rats remained sensitive to the tone's consequences. Also, in rats trained to seek and take sucrose using the PSAP, devaluation of the reward decreased responding. Furthermore, during PSAP/IntA the rats never made >∼45% correct responses, so they frequently had to restart a given puzzle. Both within and between sessions they had to struggle to solve the daily puzzle necessary to get access to cocaine, and they became increasingly motivated to do so. Therefore, the puzzles were sufficiently demanding that seeking behavior could never become “automatized.”
Tests for addiction-like behavior
What is the evidence that the rats developed addiction-like behavior? As in other studies on this topic (Deroche-Gamonet et al., 2004; Belin and Everitt, 2008), we asked whether drug experience produced symptoms diagnostic of substance-use disorders (DSM-5; American Psychiatric Association, 2013). The development of addiction-like behavior was indicated by the following: (1) an increase in how avidly cocaine was sought (seeking responses/min); (2) escalation of intake; (3) a greater willingness to defend the preferred level of consumption as cost increased, in either effort required (increased Pmax and decreased α) or upon the imposition of an adverse consequence (Max Charge); (4) resistance to extinction; and (5) very robust cue-induced “relapse.” We suggest these effects were likely due to enhanced incentive motivation (incentive-sensitization), because when cocaine had negligible cost, consumption was unchanged (Q0; Kawa et al., 2016). Although this interpretation is speculative, the pattern suggests increased “wanting,” but not “liking” (Robinson and Berridge, 1993).
However, there is considerable individual variation in susceptibility to addiction, and most people who try cocaine do not go on to develop addiction (Anthony et al., 1994). There was also considerable individual variation in addiction-like behavior in the present study. Although PSAP/IntA experience increased motivation for drug in most rats, on some measures it was especially effective in doing so in rats identified as “addiction-prone” (2–3 criteria rats). It is critical to note that 0–1 and 2–3 criteria rats did not differ before PSAP/IntA experience, but this experience produced more robust incentive-sensitization in 2–3 criteria rats.
PSAP was coupled to the recently developed IntA self-administration procedure to better mimic patterns of cocaine taking in humans, especially during the transition to addiction, when the pattern of cocaine use is very intermittent, both between and within bouts of use (Beveridge et al., 2012; Zimmer et al., 2012; Allain et al., 2015; Kawa et al., 2016). Under IntA conditions, rats take much less cocaine than with more common long-access (LgA) procedures, in which rats have continuous access for ≥6 h (Ahmed and Koob, 1999; Zimmer et al., 2012). Despite taking much less drug, IntA produces a greater increase in motivation for cocaine than LgA (Zimmer et al., 2012; Kawa et al., 2016). Furthermore, IntA produces psychomotor sensitization, the degree of which predicts the magnitude of the increase in motivation for drug (Allain et al., 2017), and it results in sensitized DA neurotransmission (Calipari et al., 2014). Finally, the magnitude of cue-induced reinstatement seen here (∼150 responses/h) and by Kawa et al. (2016) was much greater than typically seen with either short-access or LgA procedures (60–80 responses/h; Grimm et al., 2003; Saunders and Robinson, 2010). These findings suggest that the temporal pattern of cocaine use has an important role in influencing the development of addiction-like behavior (Allain et al., 2015), even in the absence of S-R habits.
Drug seeking and DA neurotransmission
It is often argued that, with prolonged drug self-administration, control over drug seeking shifts from DA transmission in the NAc to DA signaling in the DLS (Ito et al., 2002; Di Ciano and Everitt, 2004; Vanderschuren et al., 2005; Belin and Everitt, 2008; Zapata et al., 2010). Based on this functional neuroanatomy, S-R habit hypotheses of addiction suggest that drug seeking transitions from being regulated by A-O associations and S-O motivational processes, to being dictated by S-R habits (Everitt, 2014). Given that we found that drug-seeking habits are not necessary for the development of addiction-like behavior, we asked whether DA neurotransmission in the NAc and/or DLS regulates drug seeking following PSAP/IntA. The inhibition of DA receptors in the NAc, using the DA receptor antagonist flupenthixol, reduced drug seeking (at both doses tested). In contrast, inhibition of DA receptors in the DLS either enhanced (low dose) or had no effect (high dose) on drug seeking. This suggests that the development of addiction-like behavior may not require a transfer of DA control from the ventral to the dorsal striatum.
Other evidence suggests that linking the DLS only to S-R habits may be oversimplistic. Elegant experiments disconnecting the unilateral NAc core from the contralateral DLS suggest that communication between these regions is necessary for drug seeking (Belin and Everitt, 2008). Others have shown that the DLS regulates motivated responding to cues (DiFeliceantonio and Berridge, 2016) and action–outcome associations (Burton et al., 2017). Also, lesions of either the ventral or dorsal striatum reduce motivated responding for cocaine on a progressive ratio schedule (Suto et al., 2011). Furthermore, across short-access cocaine self-administration sessions (3 weeks, 1 h/d) DA transmission shifts from the NAc to the DLS in the absence of drug-seeking habits (Willuhn et al., 2012) and, surprisingly, there is no such shift in DA signaling when rats are trained using LgA procedures (despite escalating drug intake; Willuhn et al., 2014). In contrast, imaging studies of substance abusers demonstrate greater DA signaling in the dorsal striatum than in the NAc when they are presented with drug cues (Volkow et al., 2006; but also see evidence for release in the NAc; Boileau et al., 2007; Vollstädt-Klein et al., 2010; Leyton and Vezina, 2012; Jasinska et al., 2014). While this has been characterized as the “activation of DA pathways that trigger the behavioral habits leading to compulsive drug seeking and consumption” (Volkow et al., 2006), cues were presented noncontingently and not during the performance of an S-R habit. Therefore, it is difficult to say whether the dorsostriatal activations observed in cocaine addicts reflect habitual or incentive motivational processes.
Conclusion
Cocaine self-administration using PSAP coupled with IntA, which precluded the development of S-R drug-seeking habits, nevertheless resulted in the emergence of addiction-like behavior, especially in susceptible rats. Furthermore, under these conditions cocaine seeking required intact DA neurotransmission in the core of the NAc, but not in the DLS. The nature of the psychological and neural processes that control behavior is very dependent on the conditions under which behavior is studied, and some drug self-administration procedures may be useful for studying the automated habits that sometimes characterize drug consumption. However, the procedures described here may better model patterns of drug-seeking and drug-taking behavior as drug users transition to addiction. Thus, such procedures may be especially useful in determining which changes in which neuropsychological processes lead to this transition.
Footnotes
This work was supported by grants from the National Institute on Drug Abuse to B.F.S. (F32 DA038383-01, T32 DA007268-21) and T.E.R. (P01 DA031656). We thank Dr. Aldo Badiani (University of Sussex), Dr. Kent Berridge (University of Michigan), Dr. Hans Crombag (University of Sussex), and Dr. Anna Samaha (Université de Montréal) for their comments on an earlier version of the manuscript. We thank several undergraduate students for their help with the project, including Sarah Burke, Cody Carter, Yazmyn Cross, Jeffrey Hoshal, Sarah Lopez, Melanie Schweir, Brett Wietecha, Erin Wright, and Joyce Xia.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Bryan F. Singer, School of Life, Health and Chemical Sciences, Faculty of Science, Technology, Engineering and Mathematics, The Open University, Milton Keynes, UK MK7 6AA. bryan.singer{at}open.ac.uk