Abstract
Similar to other addiction disorders, the cues inherent in many gambling procedures are thought to play an important role in mediating their addictive nature. Animal models of gambling-related behavior, while capturing dimensions of economic decision making, have yet to address the impact that these salient cues may have in promoting maladaptive choice. Here, we determined whether adding win-associated audiovisual cues to a rat gambling task (rGT) would influence decision making. Thirty-two male Long–Evans rats were tested on either the cued or uncued rGT. In these tasks, animals chose between four options associated with different magnitudes and frequencies of reward and punishing time-out periods. As in the Iowa Gambling Task, favoring options associated with smaller per-trial rewards but smaller losses and avoiding the tempting “high-risk, high-reward” decks maximized profits. Although the reinforcement contingencies were identical in both task versions, rats' choice of the disadvantageous risky options was significantly greater on the cued task. Furthermore, a D3 receptor agonist increased choice of the disadvantageous options, whereas a D3 antagonist had the opposite effects, only on the cued task. These findings are consistent with the reported role of D3 receptors in mediating the facilitatory effects of cues in addiction. Collectively, these results indicate that the cued rGT is a valuable model with which to study the mechanism by which salient cues can invigorate maladaptive decision making, an important and understudied component of both gambling and substance use disorders.
SIGNIFICANCE STATEMENT We used a rodent analog of the Iowa Gambling Task to determine whether the addition of audiovisual cues would affect choice preferences. Adding reward-concurrent cues significantly increased risky choice. This is the first clear demonstration that reward-paired cues can bias cost/benefit decision making against a subject's best interests in a manner concordant with elevated addiction susceptibility. Choice on the cued task was uniquely sensitive to modulation by D3 receptor ligands, yet these drugs did not alter decision making on the uncued task. The relatively unprecedented sensitivity of choice on the cued task to D3-receptor-mediated neurotransmission suggests that similar neurobiological processes underlie the ability of cues to both bias animals toward risky options and facilitate drug addiction.
Introduction
Gambling disorder (GD), in which individuals lose control over their gambling behavior, leads to severe personal, social, and financial consequences. Estimates suggest that ∼2.5% of the general population meet the criteria for GD, with a further 4.9% exhibiting troubling yet preclinical symptomatology (Shaffer et al., 1999; Cunningham-Williams et al., 2005). A better understanding of the neuropathology underlying GD would be helpful in developing effective therapeutic interventions (Madden et al., 2007).
To this end, several behavioral procedures have been designed to evaluate gambling-like behaviors in laboratory animals. The rat gambling task (rGT; Zeeb et al., 2009) is loosely based on the Iowa Gambling Task (IGT) used clinically to assess decision making under uncertainty. In both tasks, subjects choose between four options, each resulting in distinct patterns of gains or losses according to probabilistic schedules with the goal of accruing reward. The best strategy is to favor options associated with smaller gains but also smaller penalties and to avoid the tempting “high-risk, high-reward” outcomes. Although such options can yield greater rewards per trial, the disproportionately larger punishments result in considerably less benefit over time.
Choice on the rGT and IGT is mediated by similar neural circuitry involving the medial prefrontal cortex, orbitofrontal cortex, and basolateral amygdala (Bechara et al., 1999; Fellows and Farrah, 2005; Zeeb et al., 2011, 2013; Paine et al., 2013). Surprisingly, despite substantial evidence implicating dopamine in reward-related behavior (Bergh et al., 1997; Shinohara et al., 1999; Zack et al., 2004; Dodd et al., 2005) and in the iatrogenic development of GD in parkinsonian patients (Weintraub et al., 2006), choice patterns on the rGT are not predominantly driven by the dopamine system. Amphetamine-induced choice impairments are not mediated by the psychostimulant's dopaminergic actions (Zeeb et al., 2013). Furthermore, neither administration of D1-like or D2-like agonists nor a selective dopamine reuptake inhibitor affected choice (Zeeb et al., 2009; Baarendse et al., 2012).
Highly salient win-associated cues are a significant component of human gambling, yet these are notably absent in the rGT. Although cues predictive of reward increase the release of dopamine (Schultz, 1998), dopaminergic neurons appear to fire more strongly when reward delivery is probabilistic, with the greatest increase occurring when uncertainty is maximized (Fiorillo et al., 2003). Mice will also work for the presentation of complex, variable audiovisual cues in a dopamine-dependent manner somewhat similar to cocaine self-administration (Olsen et al., 2009). Sensitivity to the behavioral influence mediated by reward-paired cues has long been associated with vulnerability to drug addiction and the propensity to relapse (Everitt et al., 2000, Kruzich et al., 2001; Saunders et al., 2010) and may also mediate the transition from recreational to problem gambling (van Holst et al., 2012; Grant and Bowling, 2015). A robust demonstration of cue-induced maladaptive decision making would therefore be of value to the study of both gambling and substance use disorder, in addition to improving the construct validity of the rGT. We therefore hypothesized that adding win-related cues to the rGT, which increased in variety and complexity with reward size, would exacerbate risky decision making and enhance the role of the dopamine system in mediating choice.
Materials and Methods
Subjects
Subjects were 48 male Long–Evans rats (Charles River Laboratories) weighing 250–275 g upon arrival at the animal facility. Animals were food restricted to 85% of free-feeding weight and maintained on a diet of 14 g of standard rat chow per day. Water was available ad libitum in home cages. Animals were housed in pairs and maintained in a climate-controlled colony room on a 12 h reverse light cycle (lights off at 8:00 A.M.). All experimental work was approved by the University of British Columbia's Animal Care Committee and husbandry was performed in accordance with the standards set forth by the Canadian Council on Animal Care.
Behavioral apparatus
Testing took place in 16 standard Med Associates 5-hole operant chambers housed in ventilated sound-attenuating cabinets. Each chamber featured a food tray outfitted with both a stimulus light and an infrared beam for detecting nose-poke inputs. Sucrose pellets (45 mg; Bio-Serv) could be delivered to this tray from an external food hopper and a house light was positioned on the chamber wall above. An array of five response apertures was located on the opposite wall, each equipped with stimulus lights and infrared beams for detecting input. The operant chambers ran according to MedPC programs authored by C.A.W. and controlled by an IBM-compatible computer.
Behavioral testing
Operant training.
Three groups of 16 rats were tested in series. Animals were initially habituated to the operant chambers over the course of two 30 min exposures during which sucrose pellets were placed in each of the apertures and animals were allowed to explore the apparatus. Animals were then trained on a variant of the five-choice serial-reaction time task (5-CSRTT) in which one of the five nose-poke apertures was illuminated for 10 s and a nose-poke response was rewarded with a single sucrose pellet delivered to the food magazine. The aperture in which the stimulus light was illuminated varied across trials. Each session consisted of 100 trials and lasted 30 min. Animals were trained on this task until responding reached 80% accuracy and <20% omissions. Once this training was complete, rats then performed a forced-choice variant of the rGT/cue preference task (CPT). This training procedure was designed so that animals were forced to respond an equal number of times to each aperture that would be used in the rGT (from left to right: 1, 2, 4, and 5) to ensure equal exposure to the contingencies associated with each hole and to minimize any potential primacy effects. For the two cohorts of 16 rats that would eventually perform the rGT, the contingencies on this task and presence/absence of cues were the same as those used in the full versions of the rGT (detailed below). For the third cohort of 16 animals that would eventually perform the CPT, the salient win-paired cues were identical to the cued rGT, but selection of any option was invariably rewarded with one sucrose pellet; there was no possibility of punishment.
rGT.
A task schematic is provided in Figure 1. Each trial began with the illumination of the tray light. A nose-poke response in the tray turned the tray light off and began a 5 s intertrial interval (ITI) during which all lights were extinguished and the animal had to refrain from responding in any of the apertures. After the ITI, response apertures 1, 2, 4, and 5 were each illuminated by a solid cue light. A nose-poke response at an illuminated aperture was then either rewarded or punished according to the unique reinforcement schedule associated with that aperture (Fig. 1). If the response was rewarded, the aperture light would be extinguished, the tray light would be illuminated, and the appropriate number of sucrose pellets would be distributed. The animal's response in the tray extinguished the tray light and initiated a new trial. If the response at the array was punished, a time-out period commenced during which the selected aperture flashed at a rate of 0.5 Hz for the duration of the penalty time-out, after which the aperture light turned off, the tray light turned on, and the animal was able to initiate a new trial. If the rat responded in any aperture during the ITI, the trial was scored as a premature response and the house light illuminated to mark a 5 s time-out period during which the animal would be unable to register a response. This premature response measure is based on that provided by the 5-CSRTT, which was developed as a rodent analog of the continuous performance task (Beck et al., 1956), and is considered a well validated measure of motor impulsivity (Voon et al., 2014; Sanchez-Roige et al., 2014). At the end of the time-out period, the house light turned off, the tray light turned on, and the animal could begin a new trial.
Unlike tasks that use a block design (Evenden and Ryan, 1996), the reinforcement contingencies were kept constant throughout the session and animals were free to choose from any option on every trial. Previous analyses indicate that choice patterns remain constant throughout the session (Zeeb et al., 2009). The different schedules of reward and punishment associated with each aperture resulted in unequal return across each 30 min session. The optimal strategy was exclusive choice of P2, which would yield the maximal expected returns due to the relatively high probability of reward (0.8) and comparatively short (10 s) and infrequent (p = 0.2) time-out penalties. Although the return on individual winning trials was higher for options P3 and P4, the higher frequency and longer duration punishments associated with these options made their selection disadvantageous over time. Numerous behavioral tasks successfully use time-out periods as effective punishments in the shaping of behavior, including the 5-CSRTT and stop-signal paradigms (Carli et al., 1983; Eagle et al., 2008). We have shown previously that these delay periods are critical in attenuating choice of the options associated with larger but less frequent rewards (Zeeb et al., 2009). The position of each option was counterbalanced across animals to mitigate any potential thigmotaxis-mediated biases toward the holes on the far side of the array. Version A (n = 16) was arranged P1, P4, P2, and P3 from left to right and version B (n = 16) was arranged P4, P1, P3, and P2. A total of 16 animals were tested on this version of the task and an additional 16 were tested on the cued rGT.
Cued rGT.
The structure of the cued rGT was identical to that of the original uncued rGT except for the introduction of audiovisual cues that accompanied reward delivery on winning trials and varied in complexity across options. Comparable to the experience of human gambling games, the magnitude of win-associated cues became considerably larger as win size increased, as shown in Table 1. Before designing the cued rGT, we first used a simple flash frequency preference test to determine whether animals preferred slower versus faster frequencies of flashing light and could discriminate between them. Each trial consisted of a choice between 2 flashing apertures, the location (holes 1–5) and flash frequency (1–5 Hz) of which were determined at random. A nose poke in either of the illuminated apertures was always rewarded with delivery of a sugar pellet at the food tray. Animals (n = 16) showed a clear preference for cue lights flashing at higher frequencies (choice: F(4,56) = 12.71, p < 0.001; data not shown). Choice of the 3, 4, and 5 Hz options was significantly higher than choice of the 1 and 2 Hz options, so we chose to use visual cues that flashed at a frequency of 5 Hz combined with a sequence of auditory tones that changed every 0.2 s. Each reward-paired cue was concurrent with pellet delivery and lasted for 2 s in total, after which a new trial could be initiated. On a rewarded P1 trial, the corresponding aperture flashed at 1 Hz and the tray light was solidly illuminated. A single tone played concurrently with the flashing cue light. Likewise, a rewarded P2 trial was marked by the cue light in the corresponding aperture flashing at a rate of 1 Hz and the tray light was again solidly illuminated. A win on P2 was also marked by a tone sequence composed of 2 distinct 1 s tones delivered in the same order on each trial.
The cues associated with the larger rewards were more complex and variable, consistent with observations that rodents find such cues appetitive (Olsen et al., 2009). The 6 tones used were as follows: 4, 8, 10, 12, 15, and 20 kHz. We have successfully used these tones as discriminative stimuli in other behavioral procedures (Winstanley et al., 2011; Rogers et al., 2013). Using the letters A–F each to represent a different tone, the patterns for P3 were as follows: CDEDCDEDCD; CECEDEDECE. Similarly, the patterns for P4 were as follows: ABCDEFEDCB; BCDCDEDEFE; CEDFCEBDAC; FEDCBAFEDC. With respect to the visual cues, the first light to flash was the hole associated with that response. For P3 and P4, the visual stimuli then became more varied in the last second of the cue, using sequences of multiple lights that changed in sync with the tones. Lights could be illuminated together (as indicated by numbers in brackets) or independently. The following numbers correspond to the aperture, numbered from left to right of the operant box. The patterns for P3 were as follows: 5434543454; (5 + 3)4(5 + 3)4(5 + 3)4(5 + 3)4(5 + 3)4. The patterns for P4 were as follows: 1234543212; (2 + 4)(1 + 3 + 5)(2 + 4)(1 + 3 + 5)(2 + 4)(1 + 3 + 5)(2 + 4)(1 + 3 + 5)(2 + 4)(1 + 3 + 5); 1324354231; 3(2 + 4)(1 + 5)(2 + 4)3(2 + 4)(1 + 5)(2 + 4)3(2 + 4). The tone/light pattern played on each winning trial was determined randomly, but no pattern was presented on sequential trials. The tray light also flashed at a frequency of 5 Hz in conjunction with the array lights and tones.
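The pattern-selection rule described above (random choice, with no pattern repeated on sequential winning trials) can be sketched as follows. This is a minimal illustration and not the authors' MedPC code: the function names and the seeded generator are hypothetical, and it assumes that "sequential trials" means consecutive wins on the same option. The tone patterns are copied from the text.

```python
import random

# Tone patterns (letters A-F) copied from the text above.
P3_TONE_PATTERNS = ["CDEDCDEDCD", "CECEDEDECE"]
P4_TONE_PATTERNS = ["ABCDEFEDCB", "BCDCDEDEFE", "CEDFCEBDAC", "FEDCBAFEDC"]


def next_pattern(patterns, previous=None, rng=random):
    """Pick a pattern at random, excluding the one shown on the previous win.

    Hypothetical helper: assumes the no-repeat rule applies to consecutive
    winning trials on the same option.
    """
    candidates = [p for p in patterns if p != previous]
    return rng.choice(candidates)


def pattern_sequence(patterns, n_wins, seed=None):
    """Generate the patterns shown across n_wins consecutive wins."""
    rng = random.Random(seed)
    sequence = []
    previous = None
    for _ in range(n_wins):
        previous = next_pattern(patterns, previous, rng)
        sequence.append(previous)
    return sequence
```

Under this reading, the two P3 patterns would simply alternate after the first win, whereas the four P4 patterns would leave three candidates on every win.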
CPT.
To control for the specific contributions of salient cues alone to choice behavior regardless of any conditioned associations between the cues and particular outcomes, an additional 16 animal cohort was trained on the CPT. This task was identical to the cued rGT except that a response to any of the four apertures was rewarded with a single sucrose pellet on an FR1 schedule with no possibility of punishment. The win-related cues were identical to those on the cued rGT. For example, whereas selection of the P4-designated aperture on the CPT results only in one sucrose pellet, the delivery of this reward was still paired with the complex P4-associated win cues. This relationship is consistent across all options. As on both versions of the rGT, the location of each option was counterbalanced across the group. Version A (n = 8) was arranged P1, P4, P2, and P3 from left to right and version B (n = 8) was arranged P4, P1, P3, and P2.
Drugs
Pharmacological manipulations began once animals had achieved stable baseline responding, defined as a nonsignificant effect of session and choice × session interaction on a repeated-measures ANOVA across the previous three sessions. All drugs were prepared fresh daily and the order in which doses were administered was determined by a Latin-square design. Each drug was administered in 3 d cycles; the first day was a baseline session, the second a drug administration day, and the third a rest day in which animals were not tested and remained in the home cage. Drugs were administered 10 min before the start of behavioral testing, except for SB-277011-A, which was administered 30 min before testing. To prevent any potential carryover effects, animals were given a washout period between drugs of at least 1 week. During this period, they were tested on the task.
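As an illustration of how a Latin-square design can order doses, the sketch below builds a simple cyclic Latin square. The text does not state which construction was used, so the cyclic form is an assumption, as are the function names and the placeholder dose labels (the actual doses are listed in Table 2).

```python
def cyclic_latin_square(conditions):
    """Return an n x n cyclic Latin square as a list of rows.

    Each condition appears exactly once in every row (one animal's dose
    order) and once in every column (one position in the order).
    The cyclic construction is an assumption, not the authors' stated method.
    """
    n = len(conditions)
    return [[conditions[(row + col) % n] for col in range(n)] for row in range(n)]


def dose_order_for(animal_index, square):
    """Assign animals to rows of the square in rotation (hypothetical scheme)."""
    return square[animal_index % len(square)]


# Placeholder labels: vehicle plus three doses of a drug.
DOSE_LABELS = ["vehicle", "low", "mid", "high"]
SQUARE = cyclic_latin_square(DOSE_LABELS)
```

With four conditions, every block of four animals receives each dose once at each position in the order, so dose is not confounded with test day.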
Drug doses are provided in Table 2. Doses and routes of administration were based on previous reports (Zeeb et al., 2009; Cocker et al., 2014). All doses were calculated as the salt. d-amphetamine sulfate, eticlopride, PD-168077, PD128907, and SB-277011-A were purchased from Sigma-Aldrich. A-381393 was a gift from Dr. Anton Pekcec of Boehringer Ingelheim. All drugs were delivered via intraperitoneal administration, with the exception of PD128907, which was delivered subcutaneously. d-amphetamine sulfate, eticlopride, PD-168077, and PD128907 were dissolved at a volume of 1 ml/kg in 0.9% sterile saline. A-381393 was dissolved in a solution of 40% 0.1 m hydrochloric acid. SB-277011-A was dissolved in a solution of sterile water and 10% w/v (2-hydroxypropyl)-β-cyclodextrin. The order of administration in the first cohort of rats performing the rGT was as follows: d-amphetamine, eticlopride, PD-168077, and A-381393. The order of administration in the second cohort of rats performing the rGT was d-amphetamine, eticlopride, PD128907, and SB-277011-A. The primary purpose of the CPT was to determine to what degree the behavioral effects of adding cues to the rGT reflected a simple preference for the more complex audiovisual cues, as opposed to any conditioned associations formed between those cues and probabilistic delivery of larger rewards. Therefore, we only tested compounds on the CPT that selectively affected performance of the cued rGT, namely the D3 receptor ligands. The order of administration in this third cohort was therefore PD128907, SB-277011-A.
Behavioral measurements
The primary dependent variable, choice of each individual option, was calculated as [(all choices of a given option)/(total trials completed)] * 100. Calculating choice preference as a percentage rather than as a raw count controlled for differences in total trials executed across sessions and between animals (Zeeb et al., 2009). A measure termed the “score variable” was developed to communicate to what extent an animal's choice was optimal. As is often used to represent data obtained from the IGT (Bechara et al., 1999), the score variable was defined as the difference between choice of the advantageous options and the disadvantageous options and was calculated according to the following formula: [(choice of P1) + (choice of P2)] − [(choice of P3) + (choice of P4)] (Zeeb et al., 2011). A positive score indicated that the rat had adopted the optimal choice strategy favoring the advantageous options, whereas a negative score indicated a net preference for the high-risk, disadvantageous options.
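The two formulas above can be written out as a short sketch. The function names and the example counts are hypothetical, and the sketch assumes that the total number of trials completed equals the sum of choices across the four options.

```python
def choice_percentages(choice_counts):
    """Convert raw choice counts per option into percentage choice.

    Assumes total trials completed equals the sum of all choices made.
    """
    total = sum(choice_counts.values())
    if total == 0:
        raise ValueError("no completed trials")
    return {option: 100.0 * n / total for option, n in choice_counts.items()}


def score_variable(percentages):
    """[(choice of P1) + (choice of P2)] - [(choice of P3) + (choice of P4)]."""
    advantageous = percentages["P1"] + percentages["P2"]
    disadvantageous = percentages["P3"] + percentages["P4"]
    return advantageous - disadvantageous


# Hypothetical session: 100 completed trials.
example_counts = {"P1": 10, "P2": 60, "P3": 20, "P4": 10}
example_pct = choice_percentages(example_counts)
example_score = score_variable(example_pct)  # 70% advantageous minus 30% disadvantageous
```

In this made-up session the score is positive, which corresponds to a net preference for the advantageous options.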
As described previously, any response made during the ITI was scored as a premature response, and these were calculated as [(total premature responses)/(total trials initiated)] * 100. As with choice preference, this formula yielded a percentage score. Latency to choose an option was calculated as the time between the end of the ITI and a response in any of the apertures. Latency to collect reward was calculated as the time between reward delivery and the animal's subsequent nose-poke response in the tray. Both choice and collection latency were averaged across session for each option. Behavioral testing continued until statistically stable performance was established, defined as no main effect of session or choice × session interaction term when analyzing data from 3 consecutive days.
Data analysis
All data analysis was performed with SPSS version 22.0.0 for Mac (IBM). Percentage variables were arcsine transformed to minimize artificial ceiling effects. Significance was set at the p < 0.05 level for all data analysis. Repeated-measures ANOVAs were used to analyze data, with choice (four levels, P1–P4), session, and drug dose (four levels, vehicle + three doses of drug) as within-subjects factors and group as a between-subjects factor. One animal in the uncued group was excluded from all analyses due to unresolved behavioral instability.
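For readers unfamiliar with the arcsine transformation of percentage data, a minimal sketch follows. The text states only that percentage variables were arcsine transformed; the arcsine-square-root form used here is one common variant and is an assumption, as is the function name.

```python
import math


def arcsine_transform(percent):
    """Arcsine-square-root transform of a percentage (0-100), in radians.

    Assumed form: asin(sqrt(p)) with p expressed as a proportion. Other
    variants (for example, 2 * asin(sqrt(p))) are also in use.
    """
    proportion = percent / 100.0
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("percent must lie between 0 and 100")
    return math.asin(math.sqrt(proportion))
```

The transform stretches values near 0% and 100% relative to mid-range values, which is why it is used to reduce ceiling and floor effects in bounded percentage data.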
Results
Baseline choice behavior
Both groups trained on versions of the rGT reached behavioral stability at the same time point (sessions 35–37; session × choice: F(6,174) = 0.39, p = 0.885; session × choice × cue: F(6,174) = 1.03, p = 0.409). Animals performing the cued rGT demonstrated a significantly more disadvantageous choice preference compared with those trained on the uncued rGT (Fig. 2A; choice × group: F(3,87) = 4.12, p = 0.009). On average, rats performing the cued task exhibited a reduced preference for P2, the best option, and also chose P3, one of the disadvantageous options, more frequently (group P2: F(1,29) = 6.44, p = 0.017, group P3: F(1,29) = 7.89, p = 0.009). No other behavioral measures differed significantly between groups (all F ≤ 3.32, all p ≥ 0.079). All animals trained on the CPT reached behavioral stability at the same time point (sessions 3–5; session × choice: F(6,90) = 1.31, p = 0.258). In contrast to the rGT and the cued rGT, animals performing the CPT did not demonstrate a significant preference for any option and sampled fairly equally between all 4 holes, although choice of the option associated with the most complex cue was greatest (Fig. 2B; choice: F(3,45) = 1.49, p = 0.231).
There were no changes in choice behavior once animals achieved stability. Behavior at baseline was compared with behavior during the vehicle injection day for each drug; no significant differences in choice behavior were found (vehicle: all F ≤ 3.08, all p ≥ 0.101; vehicle × choice: all F ≤ 1.99, all p ≥ 0.12). Additional behavioral measures are provided in Table 3.
Amphetamine
Consistent with previous reports, amphetamine increased choice of P1 and decreased choice of P2 in the uncued group (Fig. 3; dose × choice: F(9,126) = 5.58, p < 0.001, saline vs 0.3 mg/kg: F(3,42) = 5.44, p = 0.003, saline vs 1.0 mg/kg: F(3,42) = 8.39, p < 0.001, saline vs 1.5 mg/kg: F(3,42) = 10.76, p < 0.001), whereas there was no significant effect of drug on choice behavior in the cued group (Fig. 3; dose × choice: F(9,135) = 1.177, p = 0.314).
The highest dose of amphetamine increased choice latency in the cued group and, although there was an overall significant effect of dose in the uncued group, no specific dose had a significant effect on this measure compared with vehicle (dose uncued: F(3,42) = 3.38, p = 0.027; cued: F(3,45) = 3.84, p = 0.016; saline vs 1.5 mg/kg: cued F(3,45) = 3.84, p = 0.016, 1.23 ± 0.20 vs 1.67 ± 0.25). A robust increase in premature responding was observed in both cohorts (dose uncued F(3,42) = 2.39, p = 0.049; cued: F(3,87) = 7.96, p < 0.001; saline vs 0.3 mg/kg uncued F(3,42) = 2.39, p = 0.049, 27.19 ± 4.02 vs 41.34 ± 3.83; cued F(3,45) = 6.44, p = 0.001, 25.94 ± 4.31 vs 44.81 ± 4.61) and omissions were increased by amphetamine in the cued group (dose uncued F(3,42) = 1.34, p = 0.275; cued: F(3,45) = 3.44, p = 0.025, saline vs 1.5 mg/kg: F(3,45) = 3.84, p = 0.016, 0.75 ± 0.23 vs 5.13 ± 2.23). No other variable was affected by the drug (all F < 2.81, NS).
Eticlopride
In contrast to previous reports (Zeeb et al., 2009, 2013), the D2 antagonist eticlopride did not improve performance by increasing choice of the best option in the uncued procedure and a similar null effect on choice behavior was observed in the cued version of the task (dose uncued F(3,45) = 1.10, p = 0.358; cued F(3,39) = 1.23, p = 0.310; dose × choice uncued: F(9,135) = 0.89, p = 0.529; cued: F(9,117) = 1.07, p = 0.391).
Eticlopride increased the latency to make a choice (dose uncued: F(3,45) = 6.02, p = 0.002; cued: F(3,39) = 7.75, p < 0.001; saline vs 0.06 mg/kg: uncued F(1,15) = 10.04, p = 0.006, 0.94 ± 0.08 vs 1.48 ± 0.16; cued: F(1,14) = 15.36, p = 0.002, 0.96 ± 0.14 vs 1.53 ± 0.19) and decreased premature responses (dose: uncued F(3,45) = 5.54, p = 0.003; cued: dose: F(3,39) = 7.725, p < 0.001; saline vs 0.06 mg/kg uncued: F(1,15) = 8.49, p = 0.011, 23.36 ± 3.23 vs 13.89 ± 2.52; cued: F(1,14) = 13.48, p = 0.003; 27.48 ± 3.90 vs 14.71 ± 2.56). Eticlopride also slightly increased omissions in the uncued group (dose: uncued: F(3,45) = 3.90, p = 0.015; saline vs 0.06 mg/kg: F(1,15) = 4.82, p = 0.044; 0.27 ± 0.15 vs 2.60 ± 1.05; cued: F(3,39) = 0.92, p = 0.438). Collectively, these effects are indicative of motor slowing, as expected after higher doses of a dopamine antagonist. There were no effects on any other behavioral measure (all F ≤ 1.44, p ≥ 0.243).
PD128907
rGT and cued rGT
The highest dose of the D3 agonist PD128907 increased choice of the risky, disadvantageous option P3, but only in the cued group (Fig. 4A,B, Table 4; dose uncued: F(3,21) = 1.94, p = 0.154; cued: F(3,21) = 3.32, p = 0.039; saline vs 0.1 mg/kg: dose: F(1,7) = 6.628, p = 0.04).
Although PD128907 had no effect on premature responses in the cued group, the lowest and highest doses increased and decreased this form of impulsivity, respectively, in the uncued group (dose uncued: F(3,21) = 8.58, p = 0.001; saline vs 0.01 mg/kg: F(1,7) = 9.05, p = 0.020, mean ± SEM: 14.36 ± 3.23 vs 22.63 ± 4.51; saline vs 0.1 mg/kg: F(1,7) = 11.36, p = 0.012, mean ± SEM: 14.36 ± 3.23 vs 10.56 ± 4.27; cued: F(3,21) = 0.95, p = 0.43). Choice latency, collection latency, and omissions were also increased in the uncued group, but not the cued group, although no individual dose was significantly different from saline (dose: choice latency, uncued: F(3,21) = 3.59, p = 0.031, mean ± SEM: saline: 0.95 ± 0.21, 0.01 mg/kg: 0.94 ± 0.23, 0.03 mg/kg: 0.98 ± 0.19, 0.1 mg/kg: 1.13 ± 0.10; cued: F(3,21) = 0.79, p = 0.512; collection latency, uncued: F(3,21) = 3.09, p = 0.049, mean ± SEM: saline: 1.00 ± 0.10, 0.01 mg/kg: 0.99 ± 0.12, 0.03 mg/kg: 1.01 ± 0.13, 0.1 mg/kg: 1.11 ± 0.17; cued: F(3,21) = 0.77, p = 0.525; omissions, uncued: F(3,21) = 3.27, p = 0.041; mean ± SEM: saline: 0.00 ± 0.00, 0.01 mg/kg: 0.25 ± 0.25, 0.03 mg/kg: 0.5 ± 0.38, 0.1 mg/kg: 0.71 ± 0.18; cued: F(3,21) = 1.93, p = 0.156). Trials completed were not affected by the drug (uncued: F(3,21) = 0.59, p = 0.63; cued: F(3,21) = 0.93, p = 0.443).
CPT
PD128907 did not affect choice behavior in the CPT (Fig. 4C, Table 4; dose: F(3,42) = 1.93, p = 0.14; dose × choice: F(9,126) = 0.00, p = 0.96). Similar to results from the uncued rGT, the highest dose decreased premature responses (dose: F(3,45) = 6.74, p = 0.001; saline vs 0.1 mg/kg: F(3,45) = 15.78, p = 0.001, mean ± SEM: 6.62 ± 1.85 vs 1.32 ± 0.43). Choice latency also increased (dose: F(3,42) = 4.23, p = 0.01), again an effect driven by an increase at the highest dose (saline vs 0.1 mg/kg: F(1,14) = 5.52, p = 0.034, mean ± SEM: 2.26 ± 0.30 vs 2.93 ± 0.31). No other behavioral measures were affected (all F ≤ 2.79, all p ≥ 0.072).
SB277011-A
rGT and cued rGT
The lower doses of the D3 antagonist SB277011-A had the inverse pattern of effects to PD128907 in the cued group, decreasing choice of the disadvantageous option P3, yet were without effect in the uncued group (Fig. 4D,E, Table 4; dose: uncued: F(3,18) = 0.08, p = 0.969; cued: F(3,21) = 3.07, p = 0.05; saline vs 0.5 mg/kg: F(1,7) = 6.78, p = 0.035; saline vs 1.5 mg/kg: dose × choice F(1,7) = 4.81, p = 0.01). The drug did not affect any other behavioral measures (all F ≤ 2.73, p ≥ 0.070).
A-381393
A-381393, a selective D4 receptor antagonist, did not significantly affect choice in either group (dose: uncued: F(3,21) = 0.254, p = 0.254; cued: F(3,21) = 0.312, p = 0.817; dose × choice: uncued: F(9,63) = 1.22, p = 0.299; cued: F(9,63) = 1.88, p = 0.072) or any other behavioral measures (all F ≤ 2.69, all p ≥ 0.060).
PD-168077
The D4 receptor agonist PD-168077 did not affect choice behavior in either the cued or uncued groups (dose: uncued: F(3,21) = 0.256, p = 0.856; cued: F(3,21) = 0.15, p = 0.929; dose × choice: uncued: F(9,63) = 0.56, p = 0.828; cued: F(9,63) = 0.44, p = 0.911). All other behavioral measures were likewise unaffected (all F ≤ 2.03, all p ≥ 0.092).
Discussion
This work provides the first clear demonstration in an animal model that salient, audiovisual win-related cues are sufficient to enhance choice of riskier, more disadvantageous options, thereby modeling the negative impact that such cues may have on human choice. Furthermore, the presence of such cues alters the way in which certain dopaminergic ligands impact decision making. Choice on the cued task appears uniquely sensitive to modulation by D3 receptor drugs; the agonist PD128907 increased choice of one of the high-risk options, whereas the D3 antagonist SB277011-A had the opposite effect. These compounds did not affect choice in either the uncued procedure or the CPT, suggesting that cue-biased risky choice can be pharmacologically dissociated both from the process of discriminating between options associated with probabilistic reinforcement schedules and from simply responding for cue-paired rewards. In contrast, amphetamine drove a risk-averse shift away from P2 and toward P1 only in the uncued task. Numerous studies specifically implicate D3 receptors in mediating the maladaptive influence of cues in substance use disorder and recent data posit a critical role for this receptor subtype in GD (Boileau et al., 2014; Lobo et al., 2015). The cued rGT may therefore provide a novel method to determine empirically the degree to which cue sensitivity can promote poor choice in a cost/benefit model in a manner central to the addiction process.
Comparable null effects were observed across both task versions after administration of eticlopride. Previous publications either likewise report no effect of D2-like antagonists on the task (Paine et al., 2013) or observed a small increase in preference for the optimal P2 option (Zeeb et al., 2009, 2013). The reasons for these discrepant results are unclear, but collectively they indicate that D2 receptor blockade does not have robust effects on choice behavior. The selective D4 agonist and antagonist were also equally ineffective at modulating performance of either rGT version. Although D4 agents have not typically resulted in significant behavioral effects on a variety of cognitive procedures (Oak et al., 2000; Le Foll et al., 2009), this receptor subtype has been implicated in some aspects of addiction (Di Ciano et al., 2014) and in the attribution of incentive salience to subthreshold environmental stimuli during fear conditioning (Lauzon et al., 2009). We also found that D4 receptor ligands modulated the erroneous expectation of reward on a rat slot machine task (rSMT), in which the animal must correctly interpret a series of cue lights as being indicative of a win or loss to optimize reward earned (Cocker et al., 2014). In the rSMT, the cues are present during the selection and initiation of the operant response and, as per fear conditioning procedures, the cues are truly predictive of an outcome. In contrast, on the rGT, the cues are instead delivered after the choice has been made and only when the outcome involves delivery of reward. Cue presentation is therefore reward concurrent rather than reward predictive and may therefore influence choice via an alternative mechanism.
It is worth considering whether the cues inhibited learning rather than biased “informed” choice, perhaps by confusing or distracting the animal. However, animals trained on both the cued and uncued versions of the task developed stable choice preferences within the same time frame. Similar to behavior on the uncued task, animals performing the cued rGT exhibited clear preference for one of the four options, indicating that behavior is unlikely to be driven by random sampling. It thus appears that the cues neither enhanced nor impaired acquisition of the task, but simply drove preference for riskier outcomes. Animals on both tasks also made comparable numbers of premature responses and omissions, and latencies to choose an option and to collect any resulting reward did not differ across versions. It is therefore difficult to attribute the increase in risky choice on the cued task to general changes in motivation, task engagement, or a lack of awareness of the reward contingencies in play. As shown previously, choice could also be modulated independently from other behavioral variables, suggesting somewhat dissociable pharmacological regulation of these distinct aspects of performance (Zeeb et al., 2009, 2013; Silveira et al., 2015).
The question then remains as to the cognitive and neurobiological mechanisms by which reward-paired cues elicit such a shift in choice behavior. Given the numerous reports demonstrating that amphetamine potentiates the behavioral influence of reward-paired cues, the fact that amphetamine does not potentiate cue-induced risky choice appears to be something of an anomaly. For example, amphetamine increases responding for reward-paired cues in tests of conditioned reinforcement (Hill, 1970; Robbins, 1976), enhances Pavlovian approach to reward-paired cues in sign-tracking procedures (Hitchcott et al., 1997; Phillips et al., 2003), potentiates cue-induced relapse to drug seeking (Saunders et al., 2013), and also enhances the influence of reward-paired cues in a delay-discounting task (Cardinal et al., 2001). Amphetamine also increases choice of larger uncertain options in other rodent tasks (St. Onge and Floresco, 2009; Cocker et al., 2012). If win-paired cues were enhancing risky choice through their ability to act as traditional conditioned reinforcers or Pavlovian incentive stimuli, then one would expect amphetamine to potentiate this cue-induced shift in preference toward P3. In contrast, amphetamine increased choice of P1 and decreased choice of P2 in the uncued task, consistent with previous reports, yet was without significant effects on the cued procedure.
A key difference between the rGT and the other behavioral procedures listed above is that a failure to win is explicitly punished by a signaled time-out, heavily cued by a flashing stimulus light. We originally postulated (Zeeb et al., 2009) that amphetamine's ability to potentiate the behavioral influence of cues associated with aversive events (Killcross et al., 1997) led rats to favor the option associated with the shortest and least frequent penalties, P1. The null effect of amphetamine in the cued group may therefore arise because the drug-induced increase in the motivational salience of the win-paired cues competed with, and subsequently mitigated, the behavioral impact of the loss-related cues such that they were no longer sufficient to shift preference toward P1. Although speculative, such a hypothesis remains open to empirical verification in future studies.
Beyond the marked difference in baseline behavior, the most striking distinction between the cued and uncued tasks is the degree to which risky choice is modulated by D3 ligands in the former, but not the latter. There was no significant preference for any of the cues in the CPT, nor did D3 ligands modulate choice on this simple task, suggesting that these compounds do not simply augment or diminish any affective value ascribed to the cues themselves. Numerous studies have implicated D3 receptor signaling in the behavioral manifestation of drug addiction across a wide range of abused substances. Recent syntheses of the current literature indicate that D3 receptors may play a particular role in mediating the effect of drug-paired cues on behavior; not only are CPP and cue-induced reinstatement robustly attenuated by D3 antagonists, but D3-selective compounds have much greater effects on responding for drug under second-order schedules of reinforcement and higher FR schedules, in which cues play a clearer role in supporting operant behavior, than on simpler FR1 or FR2 schedules (Beninger et al., 2008; Le Foll et al., 2005). However, although there is a relative paucity of data from studies that used non-drug unconditioned stimuli, the consensus appears to be that D3 agonists and antagonists have little to no effect on such responding. For example, SB277011-A did not affect responding on a second-order schedule for sucrose reinforcement (Di Ciano et al., 2003). Although higher doses of a less-selective D3 agonist increased responding for food-paired CRf, these doses likely act at D2 receptors (Sutton et al., 2001).
Whereas the D2 receptor is expressed fairly ubiquitously in sites innervated by dopamine, the D3 receptor is concentrated within the nucleus accumbens (NAc), islands of Calleja, and limbic structures such as the hippocampus and amygdala (Bouthenet et al., 1991; Lévesque et al., 1992). The NAc, lateral habenula, and central and basolateral amygdala have been identified as key sites at which D3 receptors modulate behavioral models of drug addiction, although whether the same neural circuitry is involved in the modulation of cue-driven risky choice by D3 ligands remains to be determined (Le Foll and Di Ciano, 2015). A history of prior cocaine self-administration can enhance behavioral reactivity to D3 ligands (Blaylock et al., 2011). It has also been suggested that repeated experience of a CS that predicts reward with maximal uncertainty (50%) or responding for unpredictable reinforcement under variable rather than fixed ratio schedules can sensitize dopamine release (Zack et al., 2014; Singer et al., 2012). Given that the cues facilitated choice of the option associated with maximal uncertainty on the rGT (P3; 50% chance of three sugar pellets), the effects of the D3 receptor agents may reflect long-term alterations in the sensitivity of the DA system caused by repeated choice of options with the most uncertain outcome rather than modulation of cue-related behavior per se. This tentative hypothesis appears to be supported by the null effects of D3 manipulations on choice in the CPT. If this is the case, then D3 agents should also modulate choice in rats exhibiting high levels of risky choice on the uncued rGT. Attempts to confirm this may be limited by the fact that so few animals prefer the risky options at baseline. However, inasmuch as we were able to determine within the cohorts tested here, the magnitude of the behavioral change caused by D3 ligands did not track the strength of the preference for P3 in the cued or uncued groups.
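The sense in which a 50% reward probability is "maximally uncertain" can be made explicit with a short worked example. This is an illustrative sketch rather than an analysis from the present study: it treats each trial as a simple win/no-win (Bernoulli) outcome with win probability $p$ and reward magnitude $m$ pellets, and it ignores the time-out penalties. The symbols $p$, $m$, $R$, and $H$ are introduced here for illustration only.

```latex
% Illustrative assumption: R is the number of pellets earned on one trial,
% equal to m with probability p and 0 otherwise (time-outs are ignored).
\begin{align*}
\operatorname{Var}(R) &= m^{2}\,p\,(1-p), \\
H(p) &= -p\log_{2}p-(1-p)\log_{2}(1-p).
\end{align*}
% Both quantities are maximized at p = 1/2:
\begin{align*}
\frac{d}{dp}\,p(1-p) = 1-2p = 0 \;\Rightarrow\; p=\tfrac{1}{2},
\qquad H\!\left(\tfrac{1}{2}\right) = 1\ \text{bit}.
\end{align*}
% For P3 (m = 3, p = 0.5): Var(R) = 9 \times 0.25 = 2.25 \text{ pellets}^{2}.
```

Under these simplifying assumptions, no other win probability for a three-pellet option would yield a more variable or less predictable trial outcome than the 50% associated with P3.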
In sum, these data demonstrate that the addition of reward-paired cues to a rodent model of gambling-related decision making substantially increases maladaptive, risky choice. The presence of cues also enhanced the role that D3-receptor-mediated signaling played in regulating choice behavior. This receptor subclass has been strongly implicated in addiction, and D3-selective agents rarely modulate behavior supported by standard nutritional reinforcers. The cued rGT may therefore be relatively unique in its ability to capture decision-making deficits representative of those seen in addiction disorders and underpinned by similar neurobiological processes. Drugs that can improve decision making on this task may thus have significant clinical benefit in remedying the disordered decision making central to the maintenance of the addicted state, which remains one of the most problematic and intractable features of behavioral and chemical dependency.
Footnotes
This work was funded by a Canadian Institutes for Health Research (CIHR) Operating Grant to C.A.W. C.A.W. received salary support through the Michael Smith Foundation for Health Research and the CIHR New Investigator Program.
C.A.W. has previously consulted for Shire on an unrelated matter. M.M.B. declares no competing financial interests.
Correspondence should be addressed to either of the following: Michael M. Barrus, Djavad Mowafaghian Centre for Brain Health, University of British Columbia, 2215 Wesbrook Mall, Vancouver, British Columbia V6T 1Z3, Canada, michaelbarrus@psych.ubc.ca; or Catharine A. Winstanley, Department of Psychology, 2136 West Mall, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada, cwinstanley@psych.ubc.ca