Abstract
Rats trained to perform a version of the rat gambling task (rGT) in which salient audiovisual cues accompany reward delivery, similar to commercial gambling products, show greater preference for risky options. Given previous demonstrations that probabilistic reinforcement schedules can enhance psychostimulant-induced increases in accumbal DA and locomotor activity, we theorized that performing this cued task could perpetuate a proaddiction phenotype. Significantly more rats developed a preference for the risky options in the cued versus uncued rGT at baseline, and this bias was further exacerbated by cocaine self-administration, whereas the choice pattern of optimal decision-makers was unaffected. The addition of reward-paired cues therefore increased the proportion of rats exhibiting a maladaptive cognitive response to cocaine self-administration. Risky choice was not associated with responding for conditioned reinforcement or a marker of goal/sign-tracking, suggesting that reward-concurrent cues precipitate maladaptive choice via a unique mechanism unrelated to simple approach toward, or responding for, conditioned stimuli. Although “protected” from any resulting decision-making impairment, optimal decision-makers trained on the cued rGT nevertheless self-administered more cocaine than those trained on the uncued task. Collectively, these data suggest that repeated engagement with heavily cued probabilistic reward schedules can drive addiction vulnerability through multiple behavioral mechanisms. Rats trained on the cued rGT also exhibited blunted locomotor sensitization and lower basal accumbal DA levels, yet greater cocaine-induced increases in accumbal DA efflux. Gambling in the presence of salient cues may therefore result in an adaptive downregulation of the mesolimbic DA system, rendering individuals more sensitive to the deleterious effects of taking cocaine.
SIGNIFICANCE STATEMENT Impaired cost/benefit decision making, exemplified by preference for the risky, disadvantageous options on the Iowa Gambling Task, is associated with greater risk of relapse and treatment failure in substance use disorder. Understanding factors that enhance preference for risk may help elucidate the neurobiological mechanisms underlying maladaptive decision making in addiction, thereby improving treatment outcomes. Problem gambling is also highly comorbid with substance use disorder, and many commercial gambling products incorporate salient win-paired cues. Here we show that adding reward-concurrent cues to a rat analog of the IGT precipitates a hypodopaminergic state, characterized by blunted accumbal DA efflux and attenuated locomotor sensitization, which may contribute to the enhanced responsivity to uncertain rewards or the reinforcing effects of cocaine we observed.
- conditioned reinforcement
- dopamine
- locomotor sensitization
- microdialysis
- rat gambling task
- self-administration
Introduction
Substance use disorder (SUD) is associated with maladaptive cost/benefit decision making, as exemplified by impaired performance of the Iowa Gambling Task (IGT) (Rogers et al., 1999; Bolla et al., 2005; Verdejo-García et al., 2007; Verdejo-García and Bechara, 2009), a probabilistic decision-making task. Risky choice on the IGT predicts relapse, suggesting that decision-making deficits directly contribute to addiction maintenance (Bechara, 2003; Stevens et al., 2013; Wang et al., 2013). Animals that prefer the “high-risk, high-reward” options on a rat gambling task (rGT) loosely based on the IGT are uniquely susceptible to decision-making deficits induced by cocaine self-administration (Ferland and Winstanley, 2017), suggesting that maladaptive choice may both foster and result from drug abuse. Understanding the neurocognitive processes that promote risky decision making may therefore provide significant insight into the etiology of addiction and how to treat it.
Individuals who are more sensitive to the behavioral impact of reward-paired cues may be most at risk for addiction development (Phillips and Fibiger, 1990; Robinson and Berridge, 1993, 2008; Tomie, 1996; Tomie et al., 2008; Flagel et al., 2009, 2010). Considerable supporting evidence has been obtained using basic behavioral procedures, such as sign-tracking, which measures approach to reward-paired cues (Brown and Jenkins, 1968), or conditioned reinforcement (CRf), which reflects willingness to work for conditioned stimuli (Williams, 1994). However, the impact of cue sensitivity on cost/benefit decision making itself is poorly understood.
Win-paired audiovisual cues are heavily used in electronic gambling machines (EGMs), arguably the most addictive form of gambling (Dowling et al., 2005; Livingstone and Adams, 2011). Problem gambling is highly comorbid with SUD (Chou and Afifi, 2011; Lorains et al., 2011), indicating a potential synergy between risky choice, cue sensitivity, and addiction vulnerability. Adding win-paired, audiovisual cues augments choice of the risky, disadvantageous options on the rGT while rendering choice sensitive to D3 receptor modulators (Barrus and Winstanley, 2016). D3-selective drugs generally have limited effect on behaviors reinforced by nutritive rewards but significantly influence responding for addictive substances, particularly under densely cued schedules of reinforcement (Le Foll et al., 2005; Beninger and Banasikowski, 2008). As such, the ability of reward-concurrent cues to promote risky decision making on the rGT may tap into the same neurocognitive mechanisms by which cues facilitate engagement in addictive behaviors.
The D3 receptor is densely expressed within the NAc (Sokoloff et al., 2006), a brain area strongly implicated in the development of addiction, the response to conditioned cues, and risky decision making (Sugam et al., 2012, 2014; Bissonette and Roesch, 2016). Specifically, manipulation of dopamine (DA) within the NAc can bias rats toward either risk-averse or risk-seeking decisions (Saddoris et al., 2015; Zalocusky et al., 2016). Therefore, the influence of reward-concurrent cues on the rGT may drive risky choice by altering dopaminergic signaling in this region. Repeated experience with probabilistic reward delivery can itself be sufficient to sensitize dopaminergic neurons in a similar way to drugs of abuse, increasing amphetamine-induced locomotion and DA release in the NAc, and also potentiating amphetamine self-administration (Singer et al., 2012; Zack et al., 2014; Mascia et al., 2019). We reasoned that preference for the risky options during the rGT may similarly sensitize the response to drugs, such as cocaine, and this may be potentiated in the cued variant of the task, such that cued rGT training encourages a proaddiction phenotype.
We therefore investigated whether prior experience with the cued versus uncued rGT altered (1) self-administration of cocaine, (2) the ability of cocaine self-administration to promote risky choice, (3) locomotor sensitization to cocaine, and (4) accumbal DA release. We hypothesized that experience with win-paired cues and the disadvantageous, uncertain options of the task would sensitize the motoric and reinforcing effects of cocaine, an effect mediated by increased NAc DA tone. Given that cue sensitivity has been associated with addiction risk, we also hypothesized that the reward-paired cues in the rGT may amplify risky choice through acting as conditioned reinforcers and thereby motivate operant responding. We therefore expected cue-biased risky choice to correlate positively with responding for CRf.
Materials and Methods
Subjects
Experiment 1 used male Long–Evans rats from three separate cohorts (n = 48; Charles River Laboratories) weighing 275–300 g at the start of the experiment. Experiment 2 used 32 male transgenic rats, bred in house against a Long–Evans background, that express cre recombinase (Cre) in neurons containing tyrosine hydroxylase (TH:Cre rats; Long–Evans-Tg(TH-Cre)3.1Deis, RRRC #00659; Rat Resource and Research Center; WT rats obtained from Charles River Laboratories). However, transgene (Tg) status (Tg+, Tg−; n = 16 each) was not used for any experimental manipulation. Rats bred in house were weaned at postnatal day 21, housed in groups of 2 or 3 animals per cage, and had access to ad libitum standard rat chow until an average weight of ∼300 g was reached. Rats were then transferred to the main vivarium. In all experiments, animals were food-restricted to 85% of their free-feeding weight relative to a standard growth curve. Water was available ad libitum for both experiments. Rats were housed under a reverse 12 h light/dark cycle (lights off at 8:00 A.M.) in a temperature-controlled colony room maintained at 21°C. Testing and housing were in accordance with the Canadian Council of Animal Care, and all experimental protocols were approved by the Animal Care Committee of the University of British Columbia.
Apparatus
The rGT, CRf, and self-administration behavioral protocols were conducted in standard 5-hole operant chambers enclosed within ventilated sound-attenuating cabinets (Med Associates). To minimize task interference, cocaine self-administration and CRf were run in separate operant boxes kept in a different room within the facility. An array of five response holes was positioned on one wall of the operant chambers. The food magazine, positioned 2 cm above the rod-floor located opposite the aperture array, was attached to an external food dispenser equipped to deliver sucrose pellets (45 mg, Bioserv) to the magazine. Two retractable levers were situated on either side of the food magazine for use during self-administration or CRf. A light stimulus was situated at the back of each response hole as well as within the food magazine, and nose-poke responses into these apertures were detected by a horizontal infrared beam. Boxes were equipped with variable tone generators, and chambers could be illuminated by a house light. All manipulanda were controlled by software written in Med PC by CAW and CVH running on an IBM-compatible computer.
Self-administration boxes were fitted with an infusion apparatus consisting of a variable rate infusion pump (Med Associates), a 10 ml plastic syringe used to administer drug or vehicle, PE/PVC tubing (Instech Solomon) connected to a 22 gauge single-channel plastic swivel (Instech Solomon), and a 40 cm spring-covered tubing connector assembly (Plastics One). Locomotor testing was completed in 40 cm² Plexiglas boxes fitted with video cameras, and ambulatory activity was counted using behavioral tracking software (Ethovision 3.1, Noldus).
Experimental timeline
An overview of experimental designs is provided in Figure 1A, B. For Experiment 1, Cohorts 1 and 2 were evenly divided into experimental groups matched for cued rGT performance and trained to self-administer cocaine or saline. Rats from Cohort 3 trained on rGT or cued rGT underwent cocaine self-administration only. Animals underwent consecutive daily rGT and cocaine self-administration sessions, such that the progressive effects of voluntary cocaine ingestion on decision-making behavior could be tracked. rGT sessions were run in the mornings, before self-administration in the afternoons, to ensure rGT performance was not acutely impacted by cocaine dosing. In Experiment 2, locomotor testing was first performed before the start of cued/uncued rGT training, and then repeated immediately after behavioral stability was reached on the decision-making task, followed by microdialysis.
Behavioral testing
rGT
Animals were first habituated to the operant chambers over two daily 30 min sessions followed by nose-poke and forced-choice training as described in a previous report (Zeeb et al., 2009). Following training, animals proceeded to the free-choice version of the program. A task schematic is provided in Figure 2. During each 30 min behavioral session, the animal initiated each trial by making a nose-poke response at the illuminated food magazine. This response turned off the magazine light and triggered the start of a 5 s intertrial interval (ITI), after which nose-poke apertures 1, 2, 4, and 5 were illuminated. A nose-poke response at one of the illuminated holes resulted in either delivery of a sugar reward or a variable duration punishment time-out period during which the light in the chosen aperture flashed at 0.5 Hz. Each hole was associated with a different amount of reward (1–4 sugar pellets, P1-P4), length of penalty time-out (5–40 s), and probability of winning a reward over punishment (0.9–0.4). The cued rGT was identical to the uncued task, except that reward delivery was paired with compound light/tone cues that scaled in complexity with the size of the win (Barrus and Winstanley, 2016; Adams et al., 2017). The reinforcement contingencies were designed such that consistent choice of options that yielded larger per trial gains (P3, P4) ultimately resulted in fewer sugar pellets over the course of a session due to the longer and more frequent time-out penalties incurred. Hence, the optimal strategy was to favor the options paired with smaller incremental gains, particularly P2. The locations of choice options were counterbalanced between animals to control for potential side bias (Version A: left to right, P1, P4, P2, P3; Version B: P4, P1, P3, P2).
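To make the logic of these reinforcement contingencies concrete, the following minimal sketch estimates the number of pellets each option would yield if it were chosen exclusively for a whole session. The per-option values used here (P1: 1 pellet, 5 s time-out, p = 0.9; P2: 2 pellets, 10 s, p = 0.8; P3: 3 pellets, 30 s, p = 0.5; P4: 4 pellets, 40 s, p = 0.4) are assumptions chosen to fall within the ranges stated above; they are not specified in this section. The calculation also ignores response and collection latencies.

```python
# Illustrative sketch only: the per-option values are assumed, not taken from this section.
SESSION_S = 30 * 60   # session length stated in the text (30 min)
ITI_S = 5             # intertrial interval stated in the text (5 s)

# option -> (pellets per win, time-out in s per loss, probability of a win)
OPTIONS = {
    "P1": (1, 5, 0.9),
    "P2": (2, 10, 0.8),
    "P3": (3, 30, 0.5),
    "P4": (4, 40, 0.4),
}


def expected_pellets(pellets, timeout_s, p_win, session_s=SESSION_S, iti_s=ITI_S):
    """Expected pellets per session if one option is chosen on every trial."""
    # Mean trial duration: the ITI plus the time-out weighted by the loss probability.
    mean_trial_s = iti_s + (1 - p_win) * timeout_s
    # Mean pellets per trial divided by mean trial duration gives a reward rate.
    return session_s * (p_win * pellets) / mean_trial_s


expected = {name: expected_pellets(*values) for name, values in OPTIONS.items()}
ranking = sorted(expected, key=expected.get, reverse=True)
```

Under these assumed values, P2 gives the highest expected return and P3 and P4 the lowest, which is consistent with the description of P2 as the optimal option.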
Nose-poke responses made at the array during the ITI were recorded as premature responses, a measure of impulsive action similar to that obtained from the five-choice serial reaction time task (Robbins, 2002). Such responses were punished by a 5 s time-out penalty, signaled by illumination of the house light, after which the tray light was turned on and the animal could initiate a new trial. If rats failed to make a response at the illuminated apertures within 10 s, the trial would be registered as an omission, and the food tray again illuminated indicating that a new trial could be initiated. Animals received five or six daily sessions per week until statistically stable patterns of behavior across all measures were observed over three sessions. This took ∼35 sessions to achieve.
Cocaine self-administration
Animals were trained to lever press for cocaine hydrochloride (0.75 mg/kg/infusion, dose calculated as the salt and dissolved in sterile 0.9% saline; Medisca Pharmaceuticals) or saline vehicle over 10 daily 3 h sessions (Calu et al., 2007; Ferland and Winstanley, 2017). At the start of each self-administration session, two free infusions of solution were given to fill catheters and indicate that drug was available. Rats were presented with two levers, one active and one inactive, with an illuminated cue-light situated over the active lever. Using a fixed ratio (FR1) schedule, responses on the active lever would result in a single 4.5 s infusion in concert with the cue light flashing (50 Hz) and a 20 kHz tone. Following the infusion, animals underwent a 40 s time-out during which drug would not be delivered, the cue light and tone were extinguished, but levers would remain extended and responses monitored. Responses on the active lever during infusions and timeouts were recorded and interpreted as preliminary cocaine “seeking” behaviors. Inactive lever presses, while monitored, had no programmed consequences. Animals were limited to 30 infusions per hour to prevent overdose.
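The schedule described above can be sketched as a small classifier over active-lever press times. This is an illustration and not the Med PC program used in the study: the function name is hypothetical, the two free infusions at session start are not modeled, and the handling of presses made after the hourly cap is reached (counted separately, with the cap applied per clock hour of the session) is an assumption.

```python
INFUSION_S = 4.5     # infusion duration stated in the text
TIMEOUT_S = 40.0     # post-infusion time-out stated in the text
MAX_PER_HOUR = 30    # hourly infusion cap stated in the text


def classify_active_presses(press_times_s):
    """Sort active-lever presses (s from session start) into three lists.

    Returns (infusions, seeking, capped): reinforced presses, presses made
    during an infusion or time-out, and presses made once the hourly cap
    had been reached (the last category is an assumed convention).
    """
    infusions, seeking, capped = [], [], []
    available_at = 0.0        # earliest time the next infusion can be earned
    per_hour = {}             # infusions earned in each session hour
    for t in sorted(press_times_s):
        hour = int(t // 3600)
        if t < available_at:
            seeking.append(t)
        elif per_hour.get(hour, 0) >= MAX_PER_HOUR:
            capped.append(t)
        else:
            infusions.append(t)
            per_hour[hour] = per_hour.get(hour, 0) + 1
            available_at = t + INFUSION_S + TIMEOUT_S
    return infusions, seeking, capped
```

For example, presses at 0, 10, 44, 44.5, and 100 s would be scored as three infusions (0, 44.5, and 100 s) and two seeking responses (10 and 44 s).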
Locomotor activity assay
As in previous studies (Singer et al., 2012; Zack et al., 2014), animals were allowed a 1 h habituation period to the locomotor chamber, after which a 1 ml/kg intraperitoneal injection of saline was administered. One hour later, animals received a 10 mg/kg intraperitoneal injection of cocaine. Recording continued for a further hour. Total distance traveled (centimeters) was calculated and parsed into 5 min bins for analyses.
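As a minimal sketch of the binning step, assuming the tracking software exports distance increments as (time in seconds since injection, distance in centimeters) pairs, the 5 min bins could be computed as follows. The export format and the function name are hypothetical.

```python
def bin_distance(samples, bin_s=300, n_bins=12):
    """Sum (time_s, distance_cm) increments into consecutive time bins.

    bin_s=300 gives the 5 min bins described in the text; n_bins=12 covers
    a 1 h recording. Samples outside the recording window are ignored.
    """
    totals = [0.0] * n_bins
    for time_s, distance_cm in samples:
        index = int(time_s // bin_s)
        if 0 <= index < n_bins:
            totals[index] += distance_cm
    return totals
```

Restricting an analysis to the first 30 min after injection then amounts to taking the first six entries of the returned list.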
CRf
CRf testing was completed after the final locomotor sessions had concluded. In 10 daily 1 h sessions, subjects were first trained to associate a 5 s conditioned stimulus (CS+), which consisted of illumination of a circular disc cue light, with delivery of a food pellet into the central magazine. Animals had equal experience with an identical circular cue light control stimulus (CS−), the presentation of which had no programmed consequences. The CS+ differed from the CS− only in its spatial location (i.e., whether it appeared on the left or right of the food magazine), and this was counterbalanced across subjects. Stimulus presentation occurred on a random interval 60 s schedule in a pseudorandom sequence such that the same stimulus was not presented on more than two consecutive trials. The offset of the CS+ was contingent with the delivery of a food pellet in the central magazine, whereas the CS− was never associated with food reward. The latency to nose-poke at the food tray after illumination of the CS+ or CS− was recorded to provide an index of goal-tracking: that is, the tendency of animals to approach the location of reward (“goal”) following presentation of the CS+ (Flagel et al., 2007). Whether the CS+ had become sufficiently reinforcing as to support the acquisition of a novel operant response was then assessed on day 11. At the start of a single 1 h session, the two retractable levers were inserted into the operant chamber, one on either side of the food tray, underneath the cue lights used as conditioned stimuli. Responding on each lever resulted in illumination of the cue light located immediately above it (the CS+ or CS−) on an FR1 schedule. No sugar pellets were delivered at any time; lever press responses were reinforced solely by CS presentation. The numbers of responses made for the CS+ and CS− were recorded, and a CRf ratio was calculated as the number of responses for the CS+ divided by the total number of lever presses.
The total number of nose-pokes made at the food tray during presentation of the conditioned stimuli, as well as the latency to the first nose-poke after CS illumination, were also recorded.
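The CRf ratio described above is a simple proportion; a minimal sketch follows. The function name and the choice to return None when no presses were made are assumptions.

```python
def crf_ratio(cs_plus_presses, cs_minus_presses):
    """Proportion of all lever presses that were made for the CS+."""
    total = cs_plus_presses + cs_minus_presses
    if total == 0:
        return None  # assumed convention: the ratio is undefined without presses
    return cs_plus_presses / total
```

A ratio of 0.5 indicates no preference between levers, and values above 0.5 indicate preferential responding for the CS+.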
Surgeries
For all surgeries, rats were anesthetized with isoflurane (5% induction, 2%–3% maintenance) and given 5 mg/kg ketoprofen as an analgesic; bupivacaine was administered at the surgical site as a local anesthetic.
Experiment 1: jugular vein catheterization
Catheters constructed of Silastic silicone tubing (Dow Corning via VWR International) and attached to back-mounted cannulae (Plastics One) were aseptically implanted into the right jugular vein. Catheters were passed subcutaneously and positioned such that the cannulae exited between the shoulder blades. To prevent blockages, catheters were flushed daily with 0.1 ml of 50% heparinized saline. Behavioral testing on the cued and uncued rGT resumed after 5–7 d of recovery.
Experiment 2: cannulae implantation for microdialysis
Twenty-eight animals underwent aseptic stereotaxic surgery. As per previous work (Vacca et al., 2007), animals were bilaterally implanted with 15 mm 19 gauge nitric acid passivated stainless-steel guide cannulae above the NAc (from bregma 1.7 mm anterior and ±1.1 mm lateral; from dura −1.0 mm ventral) (Paxinos and Watson, 1998), secured via skull screws and dental cement. Stainless-steel obturators (15 mm) maintained patency of the guides until probe implantation. The remaining animals (n = 4, Tg+) were unexpectedly required in the breeding colony and were removed from the experiment.
In vivo microdialysis
Microdialysis took place in 40 cm³ Plexiglas chambers. Microdialysis probes were constructed from Filtral 12 AN69HF semipermeable hollow fibers (2 mm long, 340 μm OD, 65 kDa molecular weight cutoff) and silica inlet-outlet lines (75/150 μm ID/OD). The day before microdialysis experiments, probes were flushed with aCSF (10.0 mM sodium phosphate buffer with 147.0 mM NaCl, 3.0 mM KCl, 1.0 mM MgCl2, and 1.2 mM CaCl2, pH 7.4) and inserted into the guide cannulae (dialysis membrane spanned −4.8 to −6.8 mm ventral from the guide). Rats remained in the microdialysis chamber overnight (14–16 h) with freely available food and water, and the probes were perfused continuously with aCSF at a rate of 1.1 μl/min. The following morning, DA levels were determined in dialysates collected at 10 min intervals. Once a stable baseline was established (<10% fluctuation over four consecutive samples, ∼8 baseline samples taken per rat), animals were administered a saline injection (1 ml/kg i.p.). One hour later, animals received an injection of cocaine (10 mg/kg i.p.). Dialysates were collected for a further 1 h period. To minimize sensitization to cocaine, 2–4 d elapsed before this process was repeated for the opposite hemisphere. The first hemisphere sampled (left/right) was counterbalanced across animals.
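The baseline stability criterion can be sketched as a simple check on the most recent samples. The text does not define "fluctuation" precisely; the sketch below assumes it means that each of the last four samples lies within 10% of their mean, and the function name is hypothetical.

```python
def baseline_is_stable(samples, window=4, tolerance=0.10):
    """Return True if the last `window` samples each lie within `tolerance`
    (as a fraction) of their own mean. This definition is an assumption."""
    if len(samples) < window:
        return False
    recent = samples[-window:]
    mean = sum(recent) / window
    if mean <= 0:
        return False  # concentrations should be positive; treat otherwise as unstable
    return all(abs(x - mean) / mean < tolerance for x in recent)
```

Under this reading, sampling would continue at 10 min intervals until the check first returns True, at which point the saline injection is given.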
HPLC
Samples were analyzed via HPLC with electrochemical detection. HPLC systems were composed of the following: an ESA 582 pump (Bedford), a pulse damper (Scientific Systems), an inert manual injector (Rheodyne), a Super ODS TSK column (Tosoh Bioscience), and an Intro Electrochemical detector (Antec Leyden). The mobile phase [70 mM sodium acetate buffer, 40 mg/L EDTA, and 6 mg/L SDS (adjustable), pH 4.0, 10% methanol] flowed through the system at 0.15 ml/min. EZChrome Elite software (Scientific Software) was used to acquire and analyze chromatographic data. After the experiment, animals were killed by live decapitation, brains were sectioned at −20°C on a cryostat, and sections were stained with cresyl violet. Probe placements were verified with reference to a stereotaxic atlas (Paxinos and Watson, 1998).
Experimental design and statistical analysis
All statistical analyses were completed using SPSS Statistics 24.0 software (IBM). As per previous reports, the following rGT variables were analyzed at baseline (Experiments 1 and 2) and during self-administration (Experiment 1 only): percentage choice of each option (number of times option chosen/total number of choices × 100), score (calculated as choice of [(P1 + P2) − (P3 + P4)]), percentage of premature responses (number of premature responses/total number of trials initiated × 100), sum of omitted responses, sum of trials completed, and average latencies to choose an option and collect reward. Variables that were expressed as a percentage were subjected to an arcsine transformation to limit the effect of an artificially imposed ceiling (i.e., 100%) (McDonald, 2009). A statistically stable baseline was determined by a repeated-measures ANOVA across data from three consecutive sessions, in which both the session factor and session × choice interaction were not significant. To verify that Tg status did not alter any behavioral measure, stand-alone analyses with Tg (two levels: Tg+, Tg−) as a between-subjects factor were conducted for all rGT and cued rGT variables.
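The choice variables defined above can be sketched as follows. The session counts are hypothetical, and the arcsine square-root form of the transformation is an assumption, since the text specifies only "an arcsine transformation".

```python
import math


def percent_choice(counts):
    """Percentage choice of each option: times chosen / total choices x 100."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no choices were made in this session")
    return {option: 100.0 * n / total for option, n in counts.items()}


def score(pct):
    """Score as defined in the text: (P1 + P2) - (P3 + P4), in percentage points."""
    return (pct["P1"] + pct["P2"]) - (pct["P3"] + pct["P4"])


def arcsine_transform(percent):
    """Arcsine square-root transform of a percentage (assumed variant)."""
    return math.asin(math.sqrt(percent / 100.0))


counts = {"P1": 10, "P2": 60, "P3": 20, "P4": 10}   # hypothetical session
pct = percent_choice(counts)
session_score = score(pct)                           # 70 - 30 = 40
transformed = {option: arcsine_transform(p) for option, p in pct.items()}
```

The score ranges from −100 (exclusive choice of P3 and P4) to +100 (exclusive choice of P1 and P2).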
For the self-administration phase, we analyzed infusions achieved, active lever responses, and inactive lever responses via mixed factorial repeated-measures ANOVA. We entered session as the within-subjects factor and, unless noted otherwise, set the between-subjects factors as task and risk-preference. Significant effects from the omnibus tests were followed up with a combination of single factorial repeated-measures ANOVAs, Fisher's Least Significant Difference (LSD) post hoc test, and/or independent samples t tests. The total responses on each lever during CRf sessions in Experiment 2 were subject to repeated-measures ANOVA (two levels: active/CS+, inactive/CS−). CRf ratio data were also subjected to a univariate ANOVA. For the locomotor activity assay, the total distance traveled (centimeters) was analyzed by repeated-measures ANOVA with treatment (two levels: saline, cocaine) and time bin (6 levels: 6 × 5 min bins) as within-subjects factors. Only the first 30 min of behavior were analyzed to capture locomotor counts while cocaine was maximally effective onboard (Ciccarone, 2011). Averaged raw DA dialysate concentrations collected from each hemisphere were subjected to a repeated-measures ANOVA (treatment: 3 levels: baseline, saline, cocaine; bin: 6 levels: 6 × 10 min bins). For four rats (n = 2/task), one baseline value for one hemisphere was substituted with a value from an earlier measurement because of unusual deviation from an otherwise stable baseline (i.e., 1 of 4 baseline values deviated by more than approximately ±10%). Measurement of DA following saline and cocaine administration was also transformed into percentage change from baseline, and again analyzed with a repeated-measures ANOVA (5 levels, 5 × 10 min bins after injection).
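As a brief illustration of the percentage-change transformation, the sketch below expresses post-injection dialysate values relative to the mean of the baseline samples. Taking the mean of the baseline samples as the reference is an assumption, and the function name and example values are hypothetical.

```python
def percent_change_from_baseline(baseline, samples):
    """Express each post-injection sample as a percentage change from the
    mean of the baseline samples (assumed reference value)."""
    mean_baseline = sum(baseline) / len(baseline)
    return [100.0 * (x - mean_baseline) / mean_baseline for x in samples]


# Hypothetical concentrations in arbitrary units
baseline = [2.0, 2.0, 2.0, 2.0]
after_injection = [2.0, 3.0, 5.0]
change = percent_change_from_baseline(baseline, after_injection)
```

Because each animal is normalized to its own baseline, this measure can show a larger relative increase in a group with lower basal levels even when absolute concentrations are similar.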
In the first experiment, we excluded 16 rats from the experimental cohort due to occlusion of catheters, resulting in a cued group of n = 37 and an uncued group of n = 11. In the cocaine group, animals with a mean positive baseline score were designated as “optimal decision-makers” (n = 20), whereas rats with negative scores were classified as “risk-preferring” (n = 16). The control (i.e., saline self-administering) cohort consisted of 12 rats (risk-preferring: n = 6; optimal: n = 6). For Experiment 2, all variables from the rGT, locomotor data, and responding for CRf were analyzed with rGT variant (uncued/cued, n = 16 per task) and risk-preference (optimal, risk-preferring, n = 16 per group; uncued optimal, n = 12; uncued risk-preferring, n = 4; cued optimal, n = 4; cued risk-preferring, n = 12) as between-subjects factors. For both experiments, the frequency of risk-preferring rats trained on each task was evaluated using a χ² test followed by a Fisher Exact test (to correct the p value due to low number of animals in a specific group; i.e., n = 4 optimal rats in the cued rGT in Experiment 2). Six animals were excluded from microdialysis analyses due to inaccurate probe placement, and 1 animal's data were excluded due to illness. Analysis of DA efflux included rGT variant as a between-subjects factor (n = 11 per cued/uncued task) but not risk-preference due to lack of individual differences seen in locomotor data (see Results).
For rGT, cued rGT, and CRf behavior, data were excluded from any animal that performed <20 trials (Winstanley et al., 2011). For all analyses, if sphericity was violated as determined by Mauchly's test, a Huynh–Feldt correction was applied, and corrected p values are reported with degrees of freedom rounded to the next integer. Results were deemed to be significant if p values were less than or equal to an α of 0.05. Analyses yielding a p value between 0.05 and 0.08 were reported as a trend.
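The exclusion and reporting rules above can be written as two small helper functions. The function names are hypothetical, and treating 0.08 itself as falling within the trend range is an assumption.

```python
def include_session(trials_completed, minimum=20):
    """Data are excluded when an animal performed fewer than 20 trials."""
    return trials_completed >= minimum


def classify_p(p, alpha=0.05, trend_upper=0.08):
    """Label a p value using the reporting thresholds given in the text."""
    if p <= alpha:
        return "significant"
    if p <= trend_upper:   # assumption: the upper bound is inclusive
        return "trend"
    return "not significant"
```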
Results
Experiment 1: effects of cocaine self-administration on performance of the cued versus uncued rGT
Of the 16 rats trained on the uncued rGT, 100% developed an optimal decision-making strategy. In contrast, of the 48 rats trained on the cued variant of the task, 22 were optimal decision-makers and 26 were risk-preferring (χ² test: χ²(1) = 14.60, p < 0.001). As per results from previous studies (Barrus and Winstanley, 2016; Adams et al., 2017), animals trained on the cued rGT showed greater preference for the risky options (Table 1; cued vs uncued: choice × task: F(3,119) = 9.626, p < 0.0001; individual options: P1: t(46) = 2.270, p = 0.028; P2: t(46) = 3.475, p = 0.001; P3: t(46) = −3.460, p = 0.001; P4: t(46) = −2.134, p = 0.038; score: task: F(1,46) = 21.422, p < 0.0001). Similarly, as expected, risk-preferring rats showed a statistically significant increase in choice of the risky options (choice × risk preference: F(3,124) = 33.657, p < 0.0001; optimal versus risk preferring: P1: t(46) = 2.186, p = 0.035; P2: t(46) = 8.632, p < 0.0001; P3: t(46) = −5.914, p < 0.0001; P4: t(46) = −3.963, p = 0.0004; score: F(1,46) = 171.198, p < 0.0001).
Risk-preferring rats also completed significantly fewer trials overall (risk preference: F(1,45) = 38.943, p < 0.0001), and cued animals omitted significantly fewer trials, although no animals omitted more than two trials on average per session (task: F(1,45) = 22.868, p < 0.0001). Risk-preferring rats and animals trained on the cued task had significantly faster collection latencies (risk preference: F(1,45) = 5.640, p = 0.022; task: F(1,45) = 5.228, p = 0.027), but exhibited no differences in choice latencies (F values < 1.961, p > 0.168). Before self-administration, rats assigned to the cocaine group did not differ from saline rats on any behavioral measure (F values < 2.748, p > 0.104).
Self-administration training
Over 10 sessions of training, animals self-administering cocaine took increasing numbers of cocaine infusions (session: F(9,234) = 5.900, p < 0.001). We then collapsed across groups and followed up with a linear contrast to show that all rats took more cocaine with each successive self-administration session (F(1,27) = 29.641, p < 0.001; Fig. 3A). In contrast, animals responded steadily less on the inactive lever over time (session: F(9,234) = 3.057, p = 0.021; Fig. 3B; linear contrast: F(1,26) = 12.090, p = 0.002). Critically, we noted a between-subjects effect of task (F(1,27) = 4.410, p = 0.045) and of risk preference (F(1,27) = 11.875, p = 0.002). The task × risk preference interaction term could not be computed, however, as no risk-preferring rats were present in the uncued group. We therefore parsed the animals into three groups for follow-up analyses [uncued optimal (n = 11), cued optimal (n = 9), and cued risk-preferring (n = 16)]. We then used Fisher's LSD to show that optimal decision-makers on the cued task took more cocaine than both their risk-preferring counterparts (Fig. 3; MD = 19.664, p = 0.002) and optimal decision-makers performing the uncued rGT (MD = 12.997, p = 0.045), whereas these latter two groups did not differ from each other (MD = 6.668, p = 0.215). A similar pattern of effects was observed when the number of responses on the active lever was analyzed (group: F(2,26) = 4.175, p = 0.027; Fig. 3B), in that optimal decision-makers trained on the cued rGT made significantly more responses than risk-preferring rats (MD = 37.967, p = 0.009) or optimal decision-makers trained on the uncued task (MD = 32.850, p = 0.032), whereas these latter two groups did not differ from each other (MD = 5.117, p = 0.666).
Control animals self-administering saline took steadily fewer infusions over time (session: F(9,81) = 6.383, p = 0.032; linear contrast: F(1,9) = 25.114, p = 0.001). Similarly, animals made fewer responses on the active lever with continued training (session: F(9,81) = 2.648, p = 0.037; linear contrast: F(1,9) = 16.795, p = 0.003). Responding on the inactive lever did not change over the duration of self-administration (session: F(9,81) = 1.158, p = 0.333).
rGT performance
rGT performance was significantly affected by cocaine versus saline self-administration (drug: F(1,34) = 4.571, p = 0.040). The score of saline self-administering animals did not change, regardless of basal levels of risky choice (session × risk preference: F(9,81) = 1.376, p = 0.213). However, the score variable decreased significantly and selectively in risk-preferring rats self-administering cocaine, across both the cued and uncued task (Fig. 4: session × risk preference: F(9,225) = 2.285, p = 0.037; session: risk-preferring only: F(9,117) = 2.666, p = 0.012; optimal decision-makers only: F(9,117) = 1.188, p = 0.325). Importantly, this effect was seen in all cohorts tested (session × cohort: F(10,324) = 1.411, p = 0.181; session × drug × cohort: F(7,306) = 1.636, p = 0.129). Premature responding also declined during the self-administration epoch in all rats, an effect that tended to be more pronounced in rats trained on the uncued task (Table 2; session × task: F(9,225) = 2.693, p = 0.006; session: cued only: F(9,198) = 4.642, p < 0.001; uncued only: F(9,45) = 2.540, p = 0.019). Animals performing the uncued task also completed fewer trials during this phase of the experiment (Table 2; session × task: F(9,252) = 6.775, p < 0.001; session: cued only: F(9,189) = 1.055, p = 0.384; uncued: F(9,72) = 6.664, p < 0.001). We observed no differences in omissions or latency to collect reward (i.e., all terms of both repeated-measures ANOVAs were nonsignificant), although choice latency increased across all rats (session: F(7,190) = 6.690, p < 0.001; session × task: F(7,190) = 0.74, p = 0.641; session × risk-preference: F(7,190) = 1.764, p = 0.096). In contrast, we detected no change in score, premature responding, trials completed, omissions, collect latency, or choice latency (i.e., all terms of omnibus ANOVAs were nonsignificant) in animals self-administering saline (Table 2).
Rats weighed between 375 and 450 g during this phase of the experiment. However, weight did not predict any change in score over self-administration training (r = −0.082, p = 0.709). Although rats tended to lose weight after self-administering cocaine, the degree of weight loss also did not correlate with change in score (r = 0.295, p = 0.172).
Experiment 2: locomotor activity, CRf, and DA efflux within the NAc of rats trained to perform the cued or uncued rGT
Similar to Experiment 1, the majority of rats trained on the uncued rGT were optimal decision-makers (12 of 16), whereas the opposite pattern was observed for the cued task (4 of 16; χ2 test: χ2(1) = 8.00, p = 0.012). As in Experiment 1, rats trained on the cued rGT showed higher levels of risky choice that could be largely attributed to a greater preference for P3 (Table 3; score: task: F(1,28) = 8.928, p = 0.006; choice: option × task: F(3,72) = 2.619, p = 0.066). Rats trained on the cued rGT were also faster to collect reward (task: F(1,28) = 7.643, p = 0.010), made more premature responses (task: F(1,28) = 5.649, p = 0.025), and tended to complete fewer trials (task: F(1,28) = 4.114, p = 0.052). Performance of either task variant was indistinguishable between TH:Cre Tg+ and Tg− rats (F < 1.912, p > 0.178); therefore, Tg status was excluded as a between-subjects measure for all remaining analyses.
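The 2 × 2 comparison of optimal versus risk-preferring rats across tasks can be checked with a few lines of standard-library Python. The sketch below recomputes the uncorrected Pearson χ2 statistic from the reported counts (12 of 16 optimal on the uncued task, 4 of 16 on the cued task) and, as an illustration, a two-sided Fisher exact p value. The source does not state which procedure produced the reported p value, so the Fisher calculation is an assumption about one plausible route; on these counts it gives 0.012 to three decimals, whereas the uncorrected asymptotic p value for χ2(1) = 8.00 is smaller. Function names are hypothetical.

```python
from math import comb, erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Uncorrected Pearson chi-square for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def chi_square_p_df1(stat):
    """Upper-tail p value of a chi-square statistic with 1 degree of freedom."""
    return erfc(sqrt(stat / 2.0))


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p: sum of table probabilities <= the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows: uncued, cued. Columns: optimal, risk-preferring (counts from the text).
table = (12, 4, 4, 12)
chi2 = chi_square_2x2(*table)  # 8.0
print(chi2, chi_square_p_df1(chi2), fisher_exact_two_sided(*table))
```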
Locomotor testing
Comparing locomotor activity before versus after rGT training, we observed that uncued rGT animals increased locomotor activity after cocaine, whereas this effect was absent in rats trained on the cued rGT (Fig. 5A: day × bin × task: F(5,140) = 2.794, p = 0.019; uncued rGT: day: F(1,14) = 4.781, p = 0.046; cued rGT: day: F(1,14) = 0.370, p = 0.553). Neither group showed changes in locomotor activity after saline (Fig. 5D–F; day: F values < 0.529, p > 0.479).
The omnibus ANOVA also revealed subtle differences in risk-preference groups' ambulatory activity before and after training, most prominently in cued rGT rats (Fig. 5C; day × bin × task × risk-preference: F(5,140) = 2.270, p = 0.051; cued rGT only: day × bin × risk-preference: F(5,70) = 2.992, p = 0.017; treatment × bin × risk-preference: F(5,70) = 2.517, p = 0.04). Specifically, at baseline, cued rGT risk-preferring animals were less active than optimal decision-makers after cocaine (Fig. 5C; bin × risk-preference: F(5,70) = 2.21, p = 0.06; day × bin: F(5,70) = 2.682, p = 0.028) and saline (Fig. 5F; day × bin × risk-preference: F(5,70) = 2.609, p = 0.03). As such, animals that went on to become risk-preferring showed some signs of reduced locomotor activity in response to an injection. However, risk-preference did not account for the sensitization effect seen in these rats (Fig. 5B; day × risk-preference, day × bin × risk-preference, risk-preference F values < 1.094, p > 0.372). For all analyses, no other significant interactions were present by day, treatment, task, or risk-preference (F values < 2.856, p > 0.102).
Responding for CRf
There were no significant differences by task or risk-preference group during the conditioning period for number of nose-pokes after CS+ versus CS− presentation, latency to nose-poke, or collection latencies (data not shown, F values < 1.642, p > 0.211). During the operant test, animals responded more on the CS+ associated lever, suggesting sensitivity to CRf (Fig. 6A; lever: F(1,21) = 9.409, p = 0.006). Although optimal decision-makers appear to show greater responding for the CS+, there were no significant differences by task or by risk-preference in the number of lever presses made, the ratio of responding for the CS+ vs CS−, or the latency to nose-poke at the food tray after illumination of the CS+ or CS−, a proxy of “goal-tracking” behavior (Fig. 6B,C; latency data not shown; all F values < 2.111, p > 0.161). These data suggest neither risk-preference nor task experience influenced responding for CRf.
DA microdialysis
Histological analyses excluded 7 animals for incorrect placement of microdialysis probes (i.e., exclusively in the shell or the core; see Fig. 7A for placements, n = 22 included). As expected, cocaine significantly increased DA levels within the NAc compared with baseline and saline (Fig. 7B,C; treatment: F(1,24) = 76.839, p < 0.001; bin: F(1,38) = 10.547, p < 0.001; treatment × bin: F(2,36) = 6.909, p = 0.004). However, rats trained on the cued rGT exhibited significantly less DA efflux compared with uncued-rGT rats (Fig. 7B; task: F(1,20) = 4.598, p = 0.044; within-subjects interactions: F values < 1.064, p > 0.350). Given the group differences in basal DA levels, we also analyzed cocaine-induced increases in DA efflux as percentage change from baseline. Although basal levels of DA were relatively lower in cued rGT animals, these rats tended to show a greater proportional DA efflux compared with uncued-rGT animals, most visibly after cocaine (Fig. 7C; task: F(1,19) = 3.513, p = 0.076; treatment: F(1,19) = 48.288, p < 0.001; bin: F(2,33) = 8.909, p = 0.001; treatment × bin: F(2,34) = 6.916, p = 0.004; all interactions by task: F < 2.947, p > 0.102). Therefore, although rats trained on the cued rGT appeared to have lower levels of basal DA in the NAc, they may be more sensitive to the ability of cocaine to increase DA efflux.
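The percentage-change normalization described above can be sketched as follows. This is a minimal illustration, assuming that each post-injection sample is expressed relative to the mean of that animal's pre-injection baseline samples; the function name and all dialysate values are hypothetical and are not data from the study.

```python
from statistics import mean


def percent_change_from_baseline(baseline_samples, post_samples):
    """Express each post-injection sample as % change from the mean baseline."""
    base = mean(baseline_samples)
    return [100.0 * (x - base) / base for x in post_samples]


# Hypothetical DA concentrations (arbitrary units) for one animal per task.
uncued = percent_change_from_baseline([2.0, 2.0, 2.0], [3.0, 4.0, 3.5])
cued = percent_change_from_baseline([1.0, 1.0, 1.0], [2.0, 3.0, 2.5])

print(uncued)  # [50.0, 100.0, 75.0]
print(cued)    # [100.0, 200.0, 150.0]
```

In this toy example both animals show the same absolute rise over baseline, but the animal with the lower baseline shows a proportional increase twice as large, which is why absolute and baseline-normalized analyses can point in different directions.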
Discussion
Here we show that training rats on a gambling-like schedule of reinforcement in which wins are accompanied by salient audiovisual cues results in behavioral and neurophysiological changes indicative of a proaddiction phenotype. Interestingly, the behavioral response to cocaine self-administration manifests differently in animals able to persist in making good choices despite the presence of the cues, versus those that prefer the risky options. Similar to the results observed previously with the uncued version of the rGT (Ferland and Winstanley, 2017), the decision making of risk-preferring rats became significantly and progressively more risky as cocaine self-administration was acquired. In addition to exhibiting greater levels of risky choice as a group, the proportion of risk-preferring rats is dramatically elevated in the cued rGT, with ∼59.3% of animals exhibiting this profile as opposed to 12.5% in the uncued task. We have shown previously that the degree to which cocaine self-administration potentiates risky choice predicts elevated drug-seeking in the incubation of craving model of relapse (Ferland and Winstanley, 2017). As such, training on the cued rGT enriches the population of animals putatively vulnerable to addiction. Cocaine self-administration did not increase preference for the risky options in optimal decision-makers, regardless of which task they were trained on. However, despite this apparent resilience to the negative sequelae of drug-taking on the choice process, optimal decision-makers trained on the cued rGT self-administered significantly more cocaine, again potentially indicative of greater addiction risk.
We originally hypothesized that persistent risky choice, particularly on the cued rGT, may exacerbate addiction vulnerability through sensitizing the DA system, in keeping with reports that both exposure to cues that predict uncertain rewards, and responding for probabilistic reinforcement, can enhance locomotor sensitization to amphetamine (Singer et al., 2012; Zack et al., 2014; Mascia et al., 2019). Our data instead suggest that risk-preferring rats have a hyposensitive DA system at the outset, in that these rats show a blunted locomotor response to an injection of cocaine. Although training on the uncued rGT resulted in robust locomotor sensitization to cocaine in all rats, this was absent in animals trained on the cued rGT. Furthermore, in vivo microdialysis confirmed that basal DA efflux within the NAc was lower in rats trained on the cued compared with the uncued rGT, whereas the proportional increase in DA detected after cocaine administration tended to be greater. Collectively, these data suggest that hypoactivity within the mesolimbic DA system may promote susceptibility to the addictive properties of psychostimulants, in keeping with the reward deficiency hypothesis of addiction (Blum et al., 2000), and that repeated engagement with heavily cued probabilistic reinforcement schedules can exacerbate this vulnerability. What remains unclear from the current data is whether this hypoactive state is specific to the NAc core or shell. DA tone within these subregions may differentially affect decision-making preferences (Sugam et al., 2012, 2014), and future studies should address this issue.
Both optimal decision-makers and risk-preferring rats trained on the cued rGT made more premature responses than their uncued counterparts, indicative of greater motor impulsivity, another cognitive behavioral trait linked to addiction risk. Animals that exhibit more premature responses at baseline self-administer cocaine in a more addiction-like manner, and accumbal D2/3 receptor expression is markedly lower in these rats (Dalley et al., 2007; Belin et al., 2008). Again, currently available data would predict that elevated premature responding would be associated with a hyperactive mesolimbic DA system. For example, potentiating DA release in the NAc increases this form of motor impulsivity, and the ability of amphetamine to drive premature responding can be attenuated by direct infusions of DA antagonists into the NAc (Pattij et al., 2007; Moreno et al., 2013). However, our data may not be entirely inconsistent with these observations. The addition of reward-concurrent cues to the rGT would be expected to potentiate the increase in DA release likely caused by responding for reward on probabilistic schedules of reinforcement. Greater phasic DA release on a background of low tonic DA levels is thought to have greater neurophysiological and behavioral effects (Grace, 1991), and this may result in elevated premature responding in the cued task.
The predicted ability of reward-concurrent cues to amplify task-induced DA release may also explain why accumbal DA levels are so much lower in animals trained on the cued rGT. Specifically, repeated exposure to heavily cued probabilistic reinforcement schedules on the cued rGT, which should drive dopaminergic activity, may lead to an adaptive downregulation of basal NAc DA release in a form of homeostatic regulation. This in turn may result in sensitization of postsynaptic receptors, such that the response to events that potentiate DA signaling is disproportionately amplified. Certainly, attenuating dopaminergic signaling through administration of DA receptor antagonists can sensitize or upregulate D2-family receptors in the striatum (Seeman, 2008; Varela et al., 2014), including the D3 receptors that uniquely mediate choice on the cued rGT (Sokoloff et al., 2006; Barrus and Winstanley, 2016). As such, rats trained on the cued rGT may be more motivated to respond for certain reinforcers as predicted by reward deficiency: animals are therefore primed to self-administer cocaine or to respond for larger rewards despite the potential negative consequences as smaller rewards are insufficient to motivate behavior (Blum et al., 2000, 2012). Indeed, recent data suggest that cocaine-dependent human subjects show blunted DA release within the ventral striatum, encompassing the NAc, after amphetamine administration, and have lower ratings of euphoria after amphetamine, yet are clearly more willing to self-administer cocaine (Trifilieff et al., 2017). Previous data indicate that changes in DA signaling within the NAc can also influence preference for uncertain outcomes (St Onge et al., 2012; Sugam et al., 2012; Zalocusky et al., 2016), further emphasizing the potential synergy between the neurobiological mechanism impacted by addictive drugs and responding for uncertain outcomes.
The self-administration and CRf sessions were run in different operant chambers from the rGT and used distinct operant manipulanda and stimuli, but some features of the contexts and reward-paired cues were similar across paradigms, raising questions over whether learning generalized across tasks. However, responding for CRf was comparable regardless of risk-preference or whether animals performed the cued or uncued task variant, making it unlikely that learning from the context or set of cues used in the rGT impacted development of this behavior. Similarly, although a 20 kHz tone was used to signal delivery of cocaine and also featured in the reward-concurrent cues paired with one of the high-risk options (P4), there was no sign that preference for P4 affected cocaine-taking; levels of cocaine self-administration were comparable in risk-preferring rats trained on the cued rGT and rats trained on the uncued rGT. Likewise, optimal decision-makers performing the cued rGT took more cocaine but did not then shift their preference toward the riskier options.
The question remains as to why training on the cued task results in distinct behavioral responses to cocaine self-administration depending on baseline levels of risk preference: optimal decision-makers take considerably more cocaine, whereas risk-preferring rats make increasing numbers of risky decisions. If experience of cocaine exacerbates the subjective or behavioral response to low basal DA, as would be expected from the opponent process theory of addiction and supporting work, this may lead to increased drive to reach an optimal level of reward (Koob and Le Moal, 2008; Koob and Volkow, 2010; Der-Avakian and Markou, 2012; Belujon et al., 2016). Optimal decision-makers are immune to the allure of the risky, more heavily cued options on the rGT, and may therefore overcome this aversive state through taking more drug. Risk-preferring rats may instead seek to boost accumbal DA levels through making more risky choices. As such, the pursuit of large, risky rewards may almost “substitute” for cocaine self-administration in these animals. Alternatively, greater risky choice could result from insensitivity to punishment, similar to the “myopia for the future” previously hypothesized by others to hallmark decision making in addiction-vulnerable individuals (Bechara, 2005). These hypotheses remain speculative but may prove heuristically useful.
Poor cognitive performance after cocaine experience was limited to the decision-making domain in risk-preferring animals. All rats actually showed an improvement in impulse control, as indicated by a decrease in premature responding while self-administering cocaine. This is not the first time that self-administration of psychostimulants has been shown to decrease this form of motor impulsivity (Dalley et al., 2007; Caprioli et al., 2013), and the effect may reflect the therapeutic benefit of stimulant medications in reducing impulsive responding. While clinical data clearly show that levels of impulsive choice (selection of smaller-sooner over larger-later rewards) and impulsive action are elevated in SUD (Dalley and Robbins, 2017), data from animal models may help to track variation in impulsivity across phases of drug use. Determining which behavioral trait best predicts problematic features of drug use, and at which stage of the addiction cycle, could be enormously beneficial in targeting appropriate interventions.
The null relationship between responding for CRf and task experience or risk-preference indicates that risky choice on either task is unlikely to simply reflect greater willingness to respond for conditioned reinforcing stimuli. In an incubation of craving paradigm, we previously observed that rats identified as risk-preferring on the uncued rGT exhibited greater responding for drug-paired cues after withdrawal from cocaine self-administration, suggesting a potential link between risky choice and cue responsivity (Ferland and Winstanley, 2017). Furthermore, a large body of literature has indicated an important relationship between cue sensitivity and/or sign-tracking and the development of an addictive-like behavioral pattern (Phillips and Fibiger, 1990; Robinson and Berridge, 1993, 2008; Tomie, 1996; Tomie et al., 2008; Flagel et al., 2009, 2010). The current results, however, suggest that there may be something unique about the incentive motivation elicited by drug-paired, rather than food-paired, cues with regard to the relationship with risky choice. The exact cognitive process through which win-paired cues drive risky choice remains unclear and is the focus of ongoing research efforts.
These data may have implications for our understanding of disordered gambling and its relationship to SUD. GD is epitomized by risky decision making, as the subject continues to play despite the danger of overwhelming financial loss (Clark et al., 2013; Petry et al., 2014). Similar to patterns of brain activation in response to drug-paired cues in SUD, cues associated with monetary rewards elicit ventral striatum and insula activity in problem gamblers (Sescousse et al., 2013; Limbrick-Oldfield et al., 2017), suggesting a common mechanism between craving for drugs and gambling. EGMs use salient audiovisual cues to encourage play, and may be particularly addictive compared with other forms of gambling (Clark et al., 2013; Murch and Clark, 2016). Given the ability of experience with reward-paired cues to both exacerbate risky decision making and to promote cocaine self-administration, even in those that are not biased toward risky choices, experience with EGMs may potentially sensitize those with GD to other forms of addiction through multiple insidious mechanisms. Indeed, in a representative survey of pathological gamblers, 73% had comorbid alcohol use disorder and nearly 40% exhibited other drug abuse (Petry et al., 2005). The role played by striatal DA signaling in GD, however, seems more complex than in SUD. While a lower density of D2/3 receptors within the ventral striatum has been repeatedly observed in SUD populations (Volkow et al., 2009), data from GD subjects have been more equivocal (Clark et al., 2012; Potenza and Brody, 2013; Sescousse et al., 2016). Although data suggest those with GD have elevated DA synthesis capacity within the ventral striatum (van Holst et al., 2018), this population also shows blunted ventral striatum activation during reward anticipation (Luijten et al., 2017). 
Nevertheless, the possibility that aspects of gambling products may facilitate cross-sensitization to psychostimulant drugs is an important consideration with respect to the regulation of EGM design and should be explored further.
In conclusion, these data demonstrate that pairing uncertain rewards with salient sensory cues not only increases risky choice, but fosters a hypodopaminergic state within the NAc. Understanding the mechanism through which appetitive conditioned stimuli can drive both risky decision making and addiction vulnerability will hopefully contribute to novel therapeutic interventions for both chemical and behavioral dependency. Addictions are often highly comorbid with other psychiatric disorders in which decision making is compromised, such as schizophrenia, major depression, and anxiety (Pasche, 2012; Thoma and Daum, 2013; McCabe et al., 2017; Gómez-Coronado et al., 2018), all of which have similarly been associated with aberrant DA signaling (Grace, 2016; Faivre et al., 2018; Hellberg et al., 2018). These findings may further clarify the neurocognitive processes underlying the synergy between these psychopathologies.
Footnotes
This work was supported by a Canadian Institutes of Health Research Open Operating Grant to C.A.W. We thank Sukhbir Kaur for genotyping animals used in Experiment 2; Mason Silveira for assisting with breeding animals used in this work; and Giada Vacca for assistance with HPLC.
In the last 3 years, C.A.W. has been retained as an expert witness by Hogan Lovells LLP, and received due compensation. A.G.P. declares a patent related to glutamate receptor function (A Peptide that Specifically Blocks Regulated AMPA Receptor Endocytosis and Hippocampal CA1 Long-term Depression; European 04789721.0, and United States 13/066,700). A.G.P. also declares a pending patent for the use of d-Govadine in treatment of cognitive deficits. The remaining authors declare no competing financial interests.
Correspondence should be addressed to Jacqueline-Marie N. Ferland at jacqueline-marie.ferland@mssm.edu or Catharine A. Winstanley at cwinstanley@psych.ubc.ca