Abstract
The functionally selective 5-HT2C receptor ligand SB242084 can increase motivation and have rapid-onset antidepressant-like effects. We sought to identify the specific behavioral effects of SB242084 treatment and elucidate the mechanism in female and male mice. Using a quantitative behavioral approach, we determined that SB242084 increases the vigor and persistence of goal-directed activity across different types of physical work, particularly when work requirements are demanding. We found this influence of SB242084 on effort, rather than reward, to be reflected in striatal DA measured during behavior. Using in vivo fast scan cyclic voltammetry, we found that SB242084 has no effect on reward-related phasic DA release in the NAc. Using in vivo microdialysis to measure tonic changes in extracellular DA, we also found no changes in the NAc. In contrast, SB242084 treatment increases extracellular DA in the dorsomedial striatum, an area that plays a key role in response vigor. These findings have several implications. At the behavioral level, this work shows that the capacity to work in demanding situations can be increased, without a generalized increase in motor activity or reward value. At the circuit level, we identified a pathway-restricted potentiation of DA release and showed that this was the reason for the increased response vigor. At the cellular level, we show that a specific serotonin receptor cross talks to the DA system. Together, this information provides promise for the development of treatments for apathy, a serious clinical condition that can afflict patients with psychiatric and neurological disorders.
SIGNIFICANCE STATEMENT Motivated behaviors are modulated by reward value, effort demands, and cost-benefit computations. This information drives the decision to act, which action to select, and the intensity with which the selected action is performed. Because these behavioral processes are all regulated by DA signaling, it is very difficult to influence selected aspects of motivated behavior without affecting others. Here we identify a pharmacological treatment that increases the vigor and persistence of responding in mice, without increasing generalized activity or changing reactions to rewards. We show that the 5-HT2C-selective ligand boosts motivation by potentiating activity-dependent DA release in the dorsomedial striatum. These results reveal a novel strategy for treating patients with motivational deficits, avolition, or apathy.
Introduction
Motivation, defined as the energizing of behavior in the pursuit of a goal, results from a complex analysis of the costs and benefits associated with that behavior (Rangel and Hare, 2010; Simpson and Balsam, 2016). Motivational processes are regulated by the DA system (Salamone and Correa, 2012; Salamone et al., 2016a) and are disrupted in several psychiatric and neurological disorders. Patients with schizophrenia, depression, post-traumatic stress disorder, anxiety disorders, and Parkinson's disease experience blunted motivation, impacting their functioning and quality of life (Strauss et al., 2013; Pagonabarraga et al., 2015; Fervaha et al., 2016). Currently, there are no treatments for motivational deficits. Direct targeting of the DA system is not a suitable approach because it carries potential for addiction.
In mice, the functionally selective 5-HT2C receptor (5-HT2CR) ligand SB242084 robustly enhances motivation in multiple operant tasks (Simpson et al., 2011; Avlar et al., 2015; Bailey et al., 2016) and also induces fast-onset antidepressant-like effects (Opal et al., 2014). The serotonergic and dopaminergic systems interact via 5-HT2CRs (De Deurwaerdère and Di Giovanni, 2017). SB242084 increases the activity of midbrain DA neurons in brain slices (Di Matteo et al., 1999) and augments stimulated DA release in anesthetized rats (Navailles et al., 2004). We therefore hypothesized that 5-HT2CR modulation enhances motivation by potentiating DA release.
Two major pathways of the DA system, the mesoaccumbal projection from the VTA to the NAc and the nigrostriatal projection from the substantia nigra (SN) to the dorsal striatum (DS), are each involved in both reward-related and effort-related aspects of behavior.
Activity in the VTA and DA release in the NAc correlate with reward magnitude (Gan et al., 2010), temporal proximity to reward (Takahashi et al., 2016), and reward prediction error (Schultz et al., 1997; Hart et al., 2014). Response costs are also reflected in the NAc (Gan et al., 2010; Wanat et al., 2010), and this is where cost-benefit computations can guide decision-making (Phillips et al., 2007; Hamid et al., 2016). DA depletion or DA receptor antagonism in the NAc can alter animals' effort allocation, biasing them away from highly rewarding effortful actions, toward lower effort options (Nunes et al., 2013). In the intact NAc, cue-evoked activity can encode the probability of responding to the cue, as well as motor properties of the responses (Morrison et al., 2017).
Neurons in the DS also encode reward prediction error (Oyama et al., 2010), the value of actions (Samejima et al., 2005; Lau and Glimcher, 2007), and value-based action selection (Howard et al., 2017). There are also subpopulations of neurons in the DS that are not involved in deciding whether or not to make a particular response, but instead guide the dynamics of behavioral output (Panigrahi et al., 2015; Bartholomew et al., 2016; Yttri and Dudman, 2016). An important component of motivation is the energizing of behavior (Salamone et al., 2016b). For example, greater motivation will often be manifest in shorter latencies to start responding, faster responding, more forceful responding, responding for longer durations, and persisting in an action in the face of disruptive circumstances.
Here we examined the effects of SB242084 on both effort-related and reward-related behavior and monitored extracellular DA to determine whether the mesoaccumbal or nigrostriatal pathways were selectively involved. SB242084 does not alter reaction to rewards but changes the dynamics of effort-related responding. DA release in the NAc in response to rewards or reward-predicting cues is unaffected by SB242084 treatment. Instead, during an effort-based choice task, SB242084 increased extracellular DA levels selectively in the dorsomedial striatum. This increase in striatal DA was not observed when animals were treated with the drug but were not working to earn rewards, and is thus the result of an interaction between drug treatment and effortful activity. Infusion of a nonselective DA antagonist into the DMS confirmed our hypothesis that SB242084 increases response vigor by potentiating DA release in this region of the striatum.
Materials and Methods
Experimental design and statistical analysis
The design for each experiment is provided in Results, including the within- and between-subjects factors and a full description of critical variables required for independent replication. Complete results of the statistical tests performed are reported for each experiment in Results. t tests, two-factor ANOVA (with and without repeated measures, as appropriate), and log-rank tests were performed using GraphPad Prism, version 7. Three-factor ANOVA was performed using R Statistical Package. For all statistical analyses, a significance criterion of p < 0.05 was adopted and the precise p values obtained are presented for each test in Results.
Subjects
For the fast scan cyclic voltammetry (FSCV) experiments, male mice on a C57BL/6J:129SvEv/Tac F1 background were used. For all other experiments, male or female C57BL/6J mice (RRID:IMSR_JAX:000664; https://www.jax.org/strain/000664; The Jackson Laboratory) were used. We previously determined that the drug treatment under study (0.75 mg/kg SB242084) robustly increases motivated behavior in both strains of mice and both sexes (Simpson et al., 2011; Bailey et al., 2016). The sex and number of animals used are provided in the description of each experiment. All mice were 10–12 weeks old at the start of the experiments. All subjects were maintained at 85% of their ad libitum body weight to motivate them for food reinforcement. All experiments and animal care protocols were in accordance with the New York State Psychiatric Institute Institutional Animal Care and Use Committees and Animal Welfare Regulations.
Drug treatment
SB242084 was purchased from Tocris Bioscience and dissolved in either 0.05% DMSO in 0.9% saline or 0.9% saline alone; the corresponding solvent (0.05% DMSO in 0.9% saline, or 0.9% saline) was used as the vehicle control. In all experiments, the dose was 0.75 mg/kg and intraperitoneal injections were given 20 min before the start of behavior tests. The dose and timing of injection were selected based on results of previous studies (Simpson et al., 2011; Bailey et al., 2016).
Flupenthixol was purchased from Sigma-Aldrich and dissolved in sterile aCSF (CNS perfusion fluid from CMA/Harvard Apparatus). Infusions were performed via chronically implanted bilateral guide cannulae targeted to the DMS. The dose was 0.5 μg or 1.5 μg/0.5 μl per infusion site, infused at a rate of 0.1 μl/min. The internal cannula was left in place for 2 min after the infusion to prevent spread of the drug up the cannula. The dummy cannula was then replaced before starting behavioral testing. To determine the site of infusion, at the end of the experiment, we infused 0.5 μl of dye through the cannula, extracted the brains, and made 25 μm cryosections. Sections were thaw mounted and examined under a light microscope to determine the precise site of infusion. Mice with misplaced guide cannulae or tissue damage were excluded from the analysis.
Apparatus
Experimental chambers (ENV-307w; Med Associates), equipped with liquid dippers and pellet dispensers, were used in the experiment. Unless otherwise noted, the apparatus was identical to that used by Drew et al. (2007). Two retractable levers were mounted on either side of a feeding trough, and a house light (model 1820; Med Associates) located at the top of the chamber was used to illuminate the chamber during the sessions. Reward outcomes were typically 0.01 ml of evaporated milk, with exceptions detailed for each experiment.
Behavioral procedures
Basic lever press training
Lever press training was performed as described previously (Drew et al., 2007).
Concurrent value choice (CVC)
Sixteen male mice were first trained to press a lever on each side of the operant chamber using the basic lever press training procedure. The lever on one side of the chamber delivered a liquid sucrose solution while the opposite lever delivered a 14 mg sucrose pellet. Subjects were then exposed to increasing random ratio (RR) schedules on each lever for the different outcomes, receiving one session for a single outcome per day. The progression was as follows: 3 d on continuous schedule of reinforcement, 3 d on RR05, 3 d on RR10, and 3 d on RR20.
Once subjects were trained to press lever A for liquid sucrose and lever B for a sucrose pellet, they were moved to the CVC task. Each CVC session consisted of two separate phases. First, there were 10 single-lever forced-choice trials in which either the sucrose lever or the pellet lever was presented; subjects had to complete the press requirement on that lever to obtain the reward, thereby gaining experience with the cost of that particular outcome on a given day. Each lever was presented 5 times in a semirandom order. Second, once subjects had completed all 10 forced-choice trials, they were given 20 free-choice trials in which both levers were presented and they could choose which one to work on to obtain the reward.
Drug treatment during CVC testing.
After 2 weeks of baseline training in the 20% sucrose solution versus pellet condition, subjects were tested over the course of 4 consecutive weeks, receiving intraperitoneal injections 20 min before the start of testing. Subjects first received a week of vehicle injections, followed by a week of 0.75 mg/kg SB242084, followed by another week of vehicle, followed by another week of 0.75 mg/kg SB242084. Subjects were then exposed to 2 baseline weeks of 5% sucrose versus pellets in the CVC and then underwent 4 consecutive weeks of testing with the same drug treatment order as described for the 20% sucrose versus pellets condition. Data were averaged across the 2 weeks of each treatment type.
Pavlovian conditioning
For the FSCV experiment, 5 C57BL/6J:129SvEv/Tac F1 male mice were trained for 16 consecutive days in a Pavlovian conditioning paradigm, which consisted of 12 conditioned stimulus-positive (CS+) trials and 12 CS− trials occurring in a pseudorandom order. Each trial consisted of an auditory cue presentation for 10 s, of either an 8 kHz tone or white noise (counterbalanced between mice) and after cue offset a milk reward was delivered only in CS+ trials, whereas no reward was delivered in CS− trials. There was a 100 s variable intertrial interval, drawn from an exponential distribution of times. Head entries in the food port were recorded throughout the session, and anticipatory head entries during the presentation of the cue were considered the conditioned response.
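The session structure described above can be illustrated with a minimal sketch. It assumes a plain random shuffle for the "pseudorandom" trial order (the Methods do not state the ordering constraints) and reads the "100 s variable intertrial interval" as an exponential distribution with a 100 s mean. The function name `make_session` and its parameters are hypothetical.

```python
import random

def make_session(n_per_type=12, mean_iti_s=100.0, cue_s=10.0, seed=0):
    """Build one Pavlovian session: 12 CS+ and 12 CS- trials in shuffled order.

    Assumptions (not from the Methods): a plain shuffle stands in for the
    pseudorandom order, and the ITI is exponential with mean `mean_iti_s`.
    """
    rng = random.Random(seed)
    trials = ["CS+"] * n_per_type + ["CS-"] * n_per_type
    rng.shuffle(trials)

    schedule = []
    t = 0.0
    for kind in trials:
        t += rng.expovariate(1.0 / mean_iti_s)  # variable ITI before cue onset
        schedule.append({
            "type": kind,
            "cue_on_s": t,
            "cue_off_s": t + cue_s,      # 10 s auditory cue
            "reward": kind == "CS+",     # milk delivered after cue offset on CS+ only
        })
        t += cue_s
    return schedule

session = make_session()
print(len(session), sum(trial["reward"] for trial in session))
```

Passing a different `seed` gives a different trial order and set of intervals while keeping the 12/12 split.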
Drug treatment during Pavlovian testing.
On day 14, all mice were injected with saline to habituate them to intraperitoneal injections. On day 15, mice were injected intraperitoneally with either SB242084 or vehicle 20 min before starting the behavior, and they received the opposite treatment on day 16 (i.e., injection order was counterbalanced). Unexpected food pellets were delivered before and after the injections to check the functioning of the electrodes.
Concurrent effort choice (CEC)
After basic lever press training, 15 male mice were then exposed to increasing RR schedules on a single lever. The progression was as follows: 3 d on continuous schedule of reinforcement, 3 d on RR05, 3 d on RR10, and 3 d on RR20. Subjects were next trained to make lever holds on the opposite lever. This was done using the VIH training program. Subjects were exposed to 3 d of each of the following VIH programs in increasing order (0.5, 1.0, 2.0, 4.0, 6.0, 8.0, 10.0 s).
Once subjects were trained to make presses on lever A and holds on lever B, they were moved to the CEC task. In this task, subjects could lever press on one lever, or lever hold on the opposite lever, to obtain a reward. The session consisted of two separate phases. First, there were 10 forced-choice trials in which either the press lever or the hold lever was presented (each lever was presented 5 times). Once subjects completed all 10 of the single-lever forced-choice trials, they were presented with 40 free-choice trials. In this phase of the session, subjects were presented with both levers and they were able to choose which one they worked on to obtain the reward. We maintained the hold duration required at a constant value (10 s) and varied the press requirement over days. We used the following press requirements: 10, 20, 40, and 80.
Drug treatment during CEC testing.
After 2 weeks of baseline training in the CEC task, subjects were then tested over the course of 4 consecutive weeks, receiving intraperitoneal injections 20 min before the start of testing. Subjects first received vehicle injections, followed by 0.75 mg/kg SB242084, followed by another week of vehicle, followed by another week of 0.75 mg/kg SB242084. Data were averaged across the 2 weeks of each treatment type.
Progressive ratio (PR)
Before drug testing, 15 male mice were trained for the PR testing as previously described (Bailey et al., 2016). The specific schedule used was a PR 3 × 2, meaning the first reward required 3 lever presses and the requirement for each subsequent reward was multiplied by 2 (first 5 requirements: 3, 6, 12, 24, 48, etc.). Sessions lasted for either 2 h or until 3 min elapsed without a single press occurring, whichever occurred first.
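The PR 3 × 2 rule above is a geometric progression, which the following short sketch makes explicit. The function names are illustrative only and are not taken from the behavioral control software.

```python
def pr_requirements(start=3, multiplier=2, n=5):
    """Return the first n press requirements of a PR `start` x `multiplier` schedule."""
    requirements = []
    requirement = start
    for _ in range(n):
        requirements.append(requirement)
        requirement *= multiplier  # each subsequent reward costs `multiplier` times more
    return requirements

def cumulative_presses(requirements):
    """Total presses needed to have earned each successive reward."""
    totals, running = [], 0
    for requirement in requirements:
        running += requirement
        totals.append(running)
    return totals

print(pr_requirements())                       # [3, 6, 12, 24, 48]
print(cumulative_presses(pr_requirements()))   # [3, 9, 21, 45, 93]
```

Because the requirement doubles, the cumulative work grows almost as fast as the requirement itself, which is why sessions typically end on the stopping rule rather than by running out of time.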
Drug treatment during PR testing.
Subjects were tested in a PR 3 × 2 over 2 consecutive weeks for 5 d per week, receiving intraperitoneal injections 20 min before testing. Half of the subjects received 5 d of vehicle followed by 5 d of 0.75 mg/kg SB242084, and half received 5 d of 0.75 mg/kg SB242084 followed by 5 d of vehicle. Data are averaged over the 5 d of testing in each condition.
Extinction testing
After basic lever press training, 28 female mice were run on increasing RR schedules over days (RR5, RR10, and RR20) being tested on each schedule for 3 consecutive days. Sessions lasted for 60 min.
Drug testing during extinction.
Following the third day on an RR20 schedule, subjects were assigned into either a vehicle or drug group, matching the two groups based on press rates in RR20 so there were no baseline differences between the groups. Subjects were then tested in a 60 min extinction session, which was identical in all ways to the RR training, except that reward was never delivered. Subjects received intraperitoneal injections of either vehicle or 0.75 mg/kg SB242084 20 min before behavioral testing.
Effort-based choice
After basic lever press training, mice underwent the effort-based choice test as previously described (Simpson et al., 2011). Each session lasted 60 min during which a single milk reward (dipper presentation for 2 s) was earned after an average of 15 lever presses (RR15). During the entire session, home cage chow (Prolab) was available on the floor of the chamber.
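One common way to realize a random ratio schedule is to reinforce each press independently with probability 1/n, so that rewards arrive after n presses on average. The Methods do not state how RR15 was programmed, so the sketch below is an assumption about the implementation, and the function name is hypothetical.

```python
import random

def simulate_random_ratio(n_presses, mean_ratio=15, seed=0):
    """Count rewards earned over n_presses on an RR schedule.

    Assumption: each press is reinforced independently with
    probability 1 / mean_ratio (a constant-probability schedule).
    """
    rng = random.Random(seed)
    p_reward = 1.0 / mean_ratio
    rewards = 0
    for _ in range(n_presses):
        if rng.random() < p_reward:
            rewards += 1
    return rewards

presses = 150_000
rewards = simulate_random_ratio(presses)
print(presses / rewards)  # average presses per reward, close to 15
```

Under this scheme the number of presses per reward is geometrically distributed, so an individual reward can occasionally require far more or far fewer than 15 presses.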
Dialysis experiment.
After a minimum of 10 RR15 training sessions, 8 female mice underwent three different dialysis sampling sessions on 3 different days each ∼1 week apart. In one session mice received intraperitoneal vehicle 20 min before behavioral testing, in one session they received intraperitoneal SB242084 0.75 mg/kg 20 min before behavioral testing, and in one session they received the same drug treatment but remained in a holding chamber without behavioral testing. The order of sampling sessions was counterbalanced across subjects.
DMS infusions.
After 5 RR15 training sessions, 7 female mice were infused with aCSF or flupenthixol (1.5 μg/0.5 μl per side) on consecutive testing days. The same mice were then tested with combined intraperitoneal injection and DMS infusions in the following order: vehicle injection with sham DMS infusion (baseline), SB242084 injection with flupenthixol infusion (0.5 μg/0.5 μl per side), and vehicle injection with flupenthixol infusion (0.5 μg/0.5 μl per side).
FSCV
Surgeries for voltammetry.
Adult 10- to 12-week-old mice were chronically implanted with carbon-fiber microelectrodes constructed in house, as previously described (Clark et al., 2010). Mice were anesthetized using isoflurane in O2 (2.5% induction, 1.5% maintenance) and implanted with the electrode aimed at the NAc core (0.11 anteroposterior, −0.15 mediolateral, ∼−0.37 dorsoventral relative to bregma) and a contralateral Ag/AgCl reference electrode. To confirm the targeting of the working electrode, a stimulating electrode was temporarily placed in the ipsilateral medial forebrain bundle (−0.147 anteroposterior, −0.07/−0.15 mediolateral, −0.38 dorsoventral, 10° angled) as previously described (Oleson et al., 2014). Once electrically evoked DA release was detected in the NAc core, the stimulating electrode was removed and the working electrode was secured to the skull using dental cement. A custom-made miniature head stage was also fixed to the mouse's skull to connect the freely moving mouse to the recording system. After surgery, mice recovered for 2 weeks before beginning food restriction and behavioral training.
Voltammetric recordings.
During recording sessions, DA was detected using FSCV by applying a triangular waveform (−0.4 V to 1.3 V at 400 V/s) every 100 ms to the implanted carbon fiber microelectrodes using Tar Heel CV software. To extract the DA component from the recordings, we obtained the DA oxidation current using background subtraction (set 0.5 s before the event) and principal component regression against a training set of electrically evoked DA and pH cyclic voltammograms with two principal components (Heien et al., 2005; Keithley and Wightman, 2011). The background for each trial was the average of the last 5 scans. Before each behavioral session, we delivered two unexpected milk presentations to examine the resultant voltammogram and determine electrode functionality. The average of these two DA release events for each subject was used to normalize the data for that animal on the same day. DA concentration was obtained from the resulting current using a calibration factor of ∼80 nA/mM. This factor was based on a dataset developed in vitro to quantify DA oxidation current versus nonfaradaic background current using the method of Roberts et al. (2013).
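As a rough check on the timing implied by these waveform parameters, the sketch below computes the duration of one voltage scan and the resulting sampling rate. It assumes the triangular waveform ramps from −0.4 V to 1.3 V and back to −0.4 V at a constant 400 V/s. The function and parameter names are illustrative and are not taken from Tar Heel CV.

```python
def scan_timing(v_low=-0.4, v_high=1.3, rate_v_per_s=400.0, interval_s=0.1):
    """Timing of a triangular FSCV waveform applied once per `interval_s`.

    Assumption: one scan is an up-ramp plus a symmetric down-ramp.
    """
    sweep_v = 2.0 * (v_high - v_low)   # total voltage travelled in one scan
    scan_s = sweep_v / rate_v_per_s    # duration of one scan
    return {
        "scan_ms": scan_s * 1000.0,
        "scans_per_s": 1.0 / interval_s,
        "fraction_of_time_scanning": scan_s / interval_s,
    }

for name, value in scan_timing().items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions each scan lasts about 8.5 ms and scans repeat at 10 Hz, so the electrode sits at the holding potential for most of each 100 ms cycle.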
FSCV data analysis.
Electrochemical data were analyzed using software written in LabVIEW (National Instruments) and MATLAB (The MathWorks).
Microdialysis
Surgeries for microdialysis.
Female mice were stereotaxically implanted bilaterally with microdialysis guide cannulae for the insertion of CMA7 microdialysis probes (CMA) under ketamine/xylazine anesthesia. Guide cannulae were positioned to place the 1 mm membrane of dialysis probes in either the dorsal striatum or the nucleus accumbens. Guide coordinates were as follows: anteroposterior 1.55, mediolateral −1.25, dorsoventral −2.90 from skull surface for NAc; and anteroposterior 1.15, mediolateral 1.40, dorsoventral −1.60 from skull surface for the dorsal striatum. The guides, a head block tether (Instech), and four stabilizing screws were secured using FujiCEM 2 glass ionomer cement (GC). After the surgery, animals recovered for 2 weeks before starting food restriction and behavioral training.
Microdialysis sample collection.
All subjects underwent three different sampling sessions on 3 different days each ∼1 week apart. In one session mice received intraperitoneal vehicle 20 min before behavioral testing, in one session they received intraperitoneal SB242084 0.75 mg/kg 20 min before behavioral testing, and in one session they received the same drug treatment but remained in a holding chamber without behavioral testing. The order of sampling sessions was counterbalanced across subjects. On sampling days, CMA7 microdialysis probes (cuprophane membrane 1 mm in length; outer diameter, 0.24 mm; molecular cutoff, 6 kDa; CMA) were inserted into the guide cannulae. Once inserted, the probes were continuously perfused with aCSF (125 mM NaCl, 2.5 mM KCl, 0.9 mM NaH2PO4, 5 mM Na2HPO4, 1.2 mM CaCl2, 1 mM MgCl2, pH 7.4) at a flow rate of 1 μl/min for 3 h of equilibration while the mouse was inside a circular container within the operant test chamber. After the equilibration period, three samples were collected at 20 min intervals to establish a baseline DA level. Following baseline, mice received an intraperitoneal injection of the drug treatment (SB242084 0.75 mg/kg in 0.05% DMSO in 0.9% saline) or vehicle (0.05% DMSO in 0.9% saline) and sample collection continued at 20 min intervals. Sample collection was staggered by 5 min from event time points to account for the delay in sample recovery through the tubing and swivel.
Preprocessing of microdialysates.
Dialysate samples were collected on ice containing 3.3 μl of HeGA buffer (0.1 M glacial acetic acid, 0.1 mM EDTA; American Chemical Society grade reagent; 99.4%–100.06%, and 0.12% oxidized L-glutathione, pH adjusted with filtered NaOH to 3.70). Samples then underwent dansyl chloride derivatization, based on the method of Nirogi et al. (2013). Briefly, 15 μl of sample was placed in a polypropylene cryogenic vial with 5 μl of 50 nM DA-D4 in 1 mM HCl, 5 μl of 1 M NaHCO3, and 25 μl of freshly prepared 1% dansyl chloride solution in acetone. The sample was then vortexed and incubated at 65°C for 10 min, chilled on ice for 2 min, and then stored in liquid nitrogen until quantification. DA standards (0, 0.1, 0.2, 0.4, 0.8, 1, 1.6, 3, 3.2, 6.4, 9, 12.8 nM) prepared in aCSF underwent dansyl chloride derivatization in parallel.
Quantification of DA in microdialysates by ultra performance liquid chromatography/tandem mass spectrometry (UPLC/MS/MS) analysis.
Quantification was performed by the Biomarker Core at the Irving Center for Clinical and Translational Research at Columbia University Medical Center. Frozen preprocessed samples were thawed and subjected to a nitrogen stream followed by resuspension in 20 μl acetonitrile. The sample was transferred to an MS vial, and an additional 20 μl of water was added for analysis. All assays were performed on a Xevo TQ MS ACQUITY UPLC system (Waters). The system was controlled by MassLynx software version 4.1. The sample was maintained at 4°C in the autosampler, and a volume of 10 μl was loaded onto an ACQUITY UPLC BEH phenyl column (3 mm inner diameter × 100 mm with 1.7 μm particles, Waters, P/N 186004673), preceded by a 2.1 × 5 mm guard column containing the same packing (Waters, P/N 186003979). The column was maintained at 40°C. The flow rate was 300 μl/min in a binary gradient mode with the following mobile phase gradient: initiated with 50% mobile phase A (water containing 0.1% formic acid) and 50% mobile phase B (acetonitrile containing 0.1% formic acid). The proportion of acetonitrile was increased linearly to 99% over 2 min and maintained until 5 min. Then the column was conditioned by using the initial gradient for 1 min, and the next sample was injected. Positive ESI-MS/MS with multiple reaction monitoring mode was performed using the following parameters: capillary voltage 4 kV, source temperature 150°C, desolvation temperature 500°C, and desolvation gas flow 1000 L/h. The optimized cone voltage was 46 V. Collision energy was 64 eV. For multiple reaction monitoring analysis, the following m/z transitions were used: DA 853.3→170.2 and 853.3→263.1 for quantification and qualification, respectively (DA-d4 857.3→170.2).
Confirmation of microdialysis probe placement.
After the experiment, mice were anesthetized using a mixture of ketamine and xylazine. Previously used CMA7 microdialysis probes were stripped of their membranes under a dissection microscope and inserted into the implanted guide cannulae. Methylene blue dye was infused for 1 min at a flow rate of 1 μl/min. After 20 min, the mice were killed, the probes were removed, and the brain was extracted and frozen in isopentane at −20°C and stored at −80°C until cryosectioning at 20 μm and thaw mounting. Probe placement was determined based on dye infusion site.
Results
Systemic SB242084 treatment does not alter animals' choice for different types of reward
Here we examined whether 0.75 mg/kg SB242084 mediates its effect on motivation by changing the animal's behavioral response to reward. We implemented a novel concurrent choice paradigm (M.R.B., manuscript in submission), in which the mice could choose between lever pressing for food pellets or lever pressing for a sucrose solution (Fig. 1a). In each testing session, mice first completed 10 single-lever forced trials, which served the purpose of informing the animals of the response cost associated with each reward type for that day. For the remaining trials in the session, both levers were presented to determine choice for sucrose versus pellets. The concentration of sucrose was either 5% or 20% while the quantity of pellets delivered was constant across all sessions. Figure 1b shows that SB242084 treatment did not affect reward choice in this task. A three-factor ANOVA revealed a significant main effect of the pellet cost (F(3,15) = 249.20, p < 0.001), and sucrose concentration (F(1,15) = 319.29, p < 0.001), which resulted in a pellet cost × sucrose concentration interaction (F(3,15) = 18.10, p < 0.001), but there was no effect of treatment with SB242084 (F(1,15) = 0.13, p = 0.72) or any drug interactions in this task.
Another way of analyzing choice behavior is to calculate each individual subject's point of subjective equality (PSE), the extrapolated number of lever presses at which the subject chooses the sucrose reward and the pellet reward with equal probability. As shown in Figure 1c, the PSE for each subject was unaltered by SB242084 treatment (F(1,15) = 0.13, p = 0.72). Sucrose concentration was the only factor that significantly affected the PSE (F(1,15) = 178.46, p < 0.001). This result shows that the enhancement in goal-directed activity driven by SB242084 is not due to an altered preference for sweeter rewards or rewards of different types.
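To make the PSE concrete, the following sketch estimates it by linear interpolation between the two tested costs that bracket 50% choice. This is an assumption for illustration: the text describes the PSE only as an extrapolated value and does not specify the fitting procedure. The function name and the example choice proportions are hypothetical, not data from the study.

```python
def point_of_subjective_equality(costs, p_choice, criterion=0.5):
    """Cost at which the choice proportion crosses `criterion`.

    Uses linear interpolation between adjacent tested costs (an assumption).
    Returns None when the choice function never reaches the criterion
    within the tested range (this sketch does not extrapolate).
    """
    points = list(zip(costs, p_choice))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 == criterion:
            return float(x0)
        if (y0 - criterion) * (y1 - criterion) < 0:  # criterion lies between y0 and y1
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    if points and points[-1][1] == criterion:
        return float(points[-1][0])
    return None

# Hypothetical subject: proportion of choices for one lever at four press costs.
print(point_of_subjective_equality([10, 20, 40, 80], [1.0, 0.75, 0.25, 0.0]))  # 30.0
```

A fitted psychometric function (for example, a logistic curve) would be a common alternative and would allow extrapolation beyond the tested costs.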
SB242084 enhanced goal-directed activity is not due to altered encoding of rewards or reward cues
Although SB242084 does not appear to alter reward preference, the motivational effect of SB242084 could result from a general increase in sensitivity to all rewards. Because reinforcement value is encoded by mesoaccumbal DA, we used FSCV to determine whether SB242084 alters phasic DA released in the NAc in response to primary reinforcers or cues predicting reinforcers. Mice with chronic carbon fiber microelectrode implants were trained in a Pavlovian schedule in which two different auditory stimuli were pseudo-randomly presented. One stimulus (CS+) was paired with delivery of a milk reward (unconditioned stimulus [US]), while the other stimulus (CS−) was not. Details of the procedure are provided in Figure 2a. We measured the number of anticipatory head entries into the food port made during each of the 10 s CS presentations and subtracted the number of head entries made in the last 10 s of the intertrial interval before each CS. A two-way repeated-measures ANOVA revealed a significant interaction between cue type and session (F(15,60) = 2.43, p = 0.008), demonstrating that the mice learned the relevance of the two different cues. Sidak's multiple-comparisons test showed that the mice began responding differentially to the cues in the ninth session (Fig. 2b). Mice were tested for a total of 16 d. On day 14, all mice were injected with saline to habituate them to an intraperitoneal injection. On days 15 and 16, mice received an intraperitoneal injection of either 0.75 mg/kg SB242084 or vehicle 20 min before the behavioral session (order was counterbalanced across the group), during which DA release was recorded.
Figure 2c depicts DA release in response to CS+ and CS− cues, normalized to unexpected pellet evoked release recorded for each subject (for details, see Materials and Methods). Repeated-measures ANOVA determined that CS-evoked maximum release was affected by trial type (CS+ or CS−) (F(1,58) = 38.38, p < 0.0001) but not by drug treatment (F(1,58) = 0.25, p = 0.62), and no trial type × drug interaction was observed (F(1,58) = 0.3, p = 0.59) (Fig. 2d). Similarly, no trial type × drug interaction was observed for DA release accumulation within 10 s of tone onset (F(1,58) = 0.13, p = 0.72; Fig. 2e). Figure 2f compares DA release in response to US delivery at CS offset to a comparable period following CS− offset, as no reward was delivered in those trials. Repeated-measures ANOVA determined that US-evoked maximum release was affected by trial type (US+ or US−) (F(1,55) = 30.35, p < 0.0001), but not by drug treatment (F(1,55) = 2.45, p = 0.12), and no trial type × drug interaction was observed (F(1,55) = 1.85, p = 0.18; Fig. 2g). Again, no trial type × drug interaction was observed for DA release accumulation within 5 s of reward delivery/tone offset (interaction: F(1,55) = 0.25, p = 0.62; Fig. 2h). These results show that SB242084 has no impact on accumbal dopaminergic encoding of rewards, or of cues that have already been established to predict reward.
Systemic SB242084 treatment does not alter animals' choice for different forms of work
All goal-directed behaviors result from computation of cost versus benefit. Our FSCV studies showed that SB242084 does not alter benefit (reward) encoding; therefore, we focused the remainder of our studies on the cost (effort) component of motivation. We previously showed that SB242084 enhances performance in tasks that involve different types of work requirement. SB242084 improved performance on a progressive schedule of reinforcement where an increasing number of lever presses are required for each successive reward (Simpson et al., 2011; Bailey et al., 2016) and when single-lever presses of increasing duration are required for each successive reward (Bailey et al., 2016). To determine whether acute SB242084 treatment would alter an animal's preference for these two different types of work, we implemented a novel concurrent choice task (M.R.B., manuscript in submission), in which mice can earn reinforcers by holding one lever for a fixed period of time or by pressing a second lever for a fixed number of presses (Fig. 3a). By varying the requirement of the pressing lever across sessions while keeping the time requirement of the holding lever constant, effort choice functions and the PSE for holding and pressing were obtained for each subject. Figure 3b shows that effort choice changed as a function of the lever press requirement (F(3,14) = 75.85, p < 0.001), but treatment with SB242084 did not affect preference for holding versus pressing in this task (F(1,14) = 2.08, p = 0.15). The PSE for each subject is shown in Figure 3c, and this measure was also not altered by SB242084 treatment (t(29.97) = 0.53, p = 0.60). Subjects' vehicle PSE and SB242084 PSE were highly correlated (r(15) = 0.96; Figure 3d). These results show that the enhancement of goal-directed activity elicited by SB242084 is not due to a change in the computation of the relative effort of different response options.
The functionally selective 5-HT2C ligand SB242084 enhances motivation in goal-directed activity by increasing response vigor
That systemic SB242084 did not affect choice for different work types is consistent with SB242084 enhancing performance in tasks involving either lever holding (Bailey et al., 2016) or lever pressing (Simpson et al., 2011; Bailey et al., 2016). This includes enhancing performance on a PR schedule in which the number of presses required increases for each subsequent reinforcer. We used this schedule again to examine the effect of SB242084 on lever pressing in more detail. Using a PR × 2 schedule, we replicated our earlier result; treatment with SB242084 increased how long subjects continued to work (Mantel–Cox log rank test of survival curves, χ2 = 9.94; p = 0.0016; data not shown). SB242084 treatment also increased the rate of responding (Veh: M = 53.48 ± 7.03; SB: M = 69.83 ± 9.49; t(20) = 4.65, p = 0.002; Fig. 4a), and this increase was most apparent when the lever press requirement was >6 (Fig. 4b).
For a finer characterization of the temporal structure of responding, we analyzed bout length, pauses in responding, inter-response times (IRTs), and postreinforcement pauses (PRPs), as depicted in Figure 4c. Response bouts were defined as consecutive responses made with <2 s elapsing between responses, separated by pauses in responding. We chose 2 s as the duration to distinguish between active bouts and pauses in responding after examining the distribution of IRTs for each subject. This appeared to be a mixture distribution of very small IRTs (exponentially distributed) and much longer IRTs (approximately normally distributed). A 2 s cutoff time divided these two distributions of times most appropriately across all subjects.
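The bout segmentation described above can be sketched as follows. This is a minimal illustration rather than the analysis code used in the study: the function name, the input format (a list of press onset and offset times in seconds), and the example values are assumptions. IRTs are taken as the time from the end of one press to the start of the next, and postreinforcement pauses are not handled, because identifying them would also require reward delivery times.

```python
BOUT_CUTOFF_S = 2.0  # cutoff separating within-bout IRTs from pauses (from the text)


def segment_bouts(presses, cutoff=BOUT_CUTOFF_S):
    """Split a press record into bouts, within-bout IRTs, and pauses.

    presses: list of (onset, offset) times in seconds, sorted by onset
             (hypothetical input format).
    Returns a dict with:
      bout_lengths: number of presses in each bout
      irts:         gaps shorter than `cutoff` (within-bout IRTs)
      pauses:       gaps of at least `cutoff` (pauses between bouts)
    """
    bout_lengths, irts, pauses = [], [], []
    if not presses:
        return {"bout_lengths": bout_lengths, "irts": irts, "pauses": pauses}

    current_bout = 1
    for (_, prev_offset), (next_onset, _) in zip(presses, presses[1:]):
        gap = next_onset - prev_offset
        if gap < cutoff:
            irts.append(gap)        # press continues the current bout
            current_bout += 1
        else:
            pauses.append(gap)      # gap ends the current bout
            bout_lengths.append(current_bout)
            current_bout = 1
    bout_lengths.append(current_bout)  # close the final bout
    return {"bout_lengths": bout_lengths, "irts": irts, "pauses": pauses}


# Hypothetical press record: three quick presses, a pause, then two presses.
example = segment_bouts(
    [(0.0, 0.25), (0.75, 1.0), (1.5, 1.75), (6.0, 6.25), (6.5, 6.75)]
)
# example["bout_lengths"] == [3, 2]
# example["irts"] == [0.5, 0.5, 0.25]
# example["pauses"] == [4.25]
```

Averaging `bout_lengths`, `pauses`, and `irts` per subject and treatment condition would give summary measures of the kind compared between vehicle and SB242084.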
The average bout length across all press requirements was increased with SB242084 treatment (Veh = 7.81 ± 0.95; SB = 11.15 ± 1.39; t(20) = 6.32, p < 0.0001; Fig. 4d). Bout length was affected by press requirement under both treatment conditions (Fig. 4e), in a pattern similar to that observed for response rate (Fig. 4b). The average duration of pauses between bouts of pressing was decreased by SB242084 treatment (Veh = 9.29 ± 0.55; SB = 7.42 ± 0.56; t(20) = 4.33, p = 0.0003; Figure 4f), and pause duration increased as a function of response requirement under both treatment conditions (Fig. 4g). Figure 4b, e, g shows that bout length, pause duration, and response rate are not altered by SB at the lowest press requirements. In both conditions, the average bout length is greater than the response requirement (Fig. 4e), resulting in very few pauses (Fig. 4f). Only when the response requirement is ≥12, when mice earn rewards by working in bouts with pauses, were bout length, pause duration, and overall press rates affected by drug treatment. At the end of the session, when the requirement is 728 lever presses, there appears to be no effect of treatment. However, many mice never completed this requirement: under vehicle and SB treatment conditions, only 9 of 21 and 16 of 21 mice, respectively, completed the 728 press requirement at least once in the 5 d of testing in each condition. Thus, the data reflect an incomplete sample, and it is likely that other factors are at play, including fatigue and distraction.
As well as shortening pauses, SB treatment also resulted in more rapid execution of responses within a bout (IRTs Veh = 0.52 ± 0.05; SB = 0.44 ± 0.05; t(20) = 5.29, p < 0.0001; Fig. 4h). Figure 4i shows that IRT appeared to be reduced by SB across all press requirements. Finally, we observed that SB treatment had no effect on the duration of PRPs (Veh = 2.81 ± 0.53; SB = 2.14 ± 0.34; t(20) = 1.23, p = 0.23; Figure 4j). A trend for the PRP to increase as a function of press requirement was observed in both conditions (Fig. 4k).
In summary, SB treatment enhances performance in the PR task by increasing response vigor: longer bouts of responding, shorter pauses between bouts of responding, and more rapid execution of presses within response bouts, leading to an overall increase in response rates. Consistent with our other results showing that SB does not alter reaction to reward, the durations of PRPs are not affected by SB242084.
SB242084 enhanced goal-directed activity is not due to an increased resistance to extinction
Persistence in responding could also be due to changes in extinction processes, especially under progressive schedules in which greater numbers of nonreinforced responses are required as the schedule progresses. Therefore, we tested the effect of systemic SB242084 on behavioral extinction. Mice were trained on RR schedules of reinforcement, with the ratio increasing progressively across sessions, up to RR20. After stable performance on this schedule, mice were treated with either drug or vehicle 20 min before a testing session in which no reinforcers were delivered. Figure 5a shows no differences in press rates between the two treatment assignment groups during the last RR20 session before extinction testing. Average press rates for the mice assigned to the VEH group were 34.67 ± 3.36, and to the SB group 33.87 ± 3.19 (t(26) = 0.17, p = 0.86). Figure 5b depicts the rate of decay in pressing during the extinction session fit to the exponential decay equation: Y = a × exp(−b × x), where a represents the peak press rate and b represents the decay rate. The lever press data for each individual subject were fit to this equation to determine whether extinction was affected by treatment. Figure 5c shows that SB had no effect on peak press rate (a): Veh 444.9 ± 32.37, SB 466.8 ± 42.11 (t(26) = 0.41, p = 0.68). Figure 5d shows no effect of drug on rate of decay (b): Veh 0.051 ± 0.004, SB 0.057 ± 0.011 (t(26) = 0.47, p = 0.64). Figure 5e shows no effect of drug on goodness of fit (R2): Veh 0.807 ± 0.032, SB 0.780 ± 0.045 (t(26) = 0.49, p = 0.62).
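To make the per-subject fitting step concrete, the sketch below fits the exponential decay equation given above to a press-rate series by least squares and reports a, b, and R2. It is an assumption-laden illustration, not the procedure used in the study: the fitting software is not stated in the text, so the sketch uses a simple pure-Python approach in which the best a for any fixed b has a closed form and b is found by golden-section search. The function name, the search range for b, and the synthetic data are hypothetical.

```python
import math


def fit_exponential_decay(xs, ys, b_lo=0.0, b_hi=1.0, iters=100):
    """Least-squares fit of y = a * exp(-b * x).

    For a fixed b, the a minimizing squared error has a closed form;
    b is then found by golden-section search on [b_lo, b_hi]
    (assumes the error has a single minimum in that range).
    Returns (a, b, r_squared).
    """
    def best_a(b):
        num = sum(y * math.exp(-b * x) for x, y in zip(xs, ys))
        den = sum(math.exp(-2.0 * b * x) for x in xs)
        return num / den

    def sse(b):
        a = best_a(b)
        return sum((y - a * math.exp(-b * x)) ** 2 for x, y in zip(xs, ys))

    ratio = (math.sqrt(5.0) - 1.0) / 2.0  # golden-section ratio
    lo, hi = b_lo, b_hi
    for _ in range(iters):
        c = hi - ratio * (hi - lo)
        d = lo + ratio * (hi - lo)
        if sse(c) < sse(d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    b = (lo + hi) / 2.0
    a = best_a(b)

    mean_y = sum(ys) / len(ys)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)  # assumes ys are not all equal
    r_squared = 1.0 - sse(b) / ss_tot
    return a, b, r_squared


# Synthetic, noise-free example (not data from the study).
minutes = list(range(60))
rates = [450.0 * math.exp(-0.05 * t) for t in minutes]
a_hat, b_hat, r2 = fit_exponential_decay(minutes, rates)
# a_hat is close to 450, b_hat is close to 0.05, and r2 is close to 1
```

Fitting each subject separately in this way yields one a, one b, and one R2 per animal, which can then be compared between treatment groups as in Figure 5c–e.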
The effects of systemic SB242084 are most beneficial when work requirements are high
Given that SB242084 changed response dynamics in the PR only when response requirements were increased within each session, we next investigated whether SB242084 increases vigor in sessions with constant work requirements, across a range of demands. We tested the effect of SB242084 on different fixed ratios across sessions (FRs 10, 20, 40, and 80) and found that the drug's effects on performance were selectively beneficial at higher work requirements. Figure 6a depicts the average rate of lever pressing as a function of time across trials for the four different work requirements. Differences in the distribution of press rates across the trial appear to be affected by both ratio and drug treatment, with peak press rates being highest in the FR40 sessions in both conditions. We examined the time taken to complete each trial as a function of lever press requirement. Repeated-measures ANOVA detected an effect of ratio (F(3,42) = 63.69; p < 0.0001), an effect of drug treatment (F(1,14) = 17.4; p = 0.0009), and a significant ratio × drug treatment interaction (F(3,42) = 8.89; p = 0.0001). Bonferroni's multiple-comparisons test showed that time to complete the FR was decreased by drug only in FR80 trials (p < 0.0001). This may be because only at this requirement has there been enough lever pressing for small differences in the IRT to impact the overall time taken to complete the trial. To test this idea, we looked at the IRTs (as defined in Fig. 4c) across FRs (Fig. 6c) and found that SB led to a significant decrease in average IRTs (F(1,14) = 8.453; p = 0.01). This decrease in IRTs occurred across all trial types as there was no effect of ratio, and no interaction. Along with our previous data from PR experiments, these results show that SB242084 increases the vigor of responding, which becomes increasingly advantageous over more demanding work requirements.
Systemic SB242084 potentiates effort-dependent DA release in the dorsomedial striatum and not the NAc
Because response vigor is modulated by DA signaling in the dorsal striatum, we used in vivo microdialysis to examine the effect of SB242084 on tonic DA levels in the dorsal striatum and ventral striatum (NAc) during goal-directed behavior. Mice were tested in an effort-based choice paradigm, in which they could lever press for a preferred reinforcer (evaporated milk) or consume home cage chow that was available ad libitum on the floor of the test chamber. Because rate of reinforcement influences extracellular DA levels, we set the work requirement such that SB242084 treatment would not cause a large increase in the number of reinforcers earned (e.g., RR15). Drug treatment significantly increased extracellular DA concentrations in the dorsomedial striatum when the mice were engaged in the effort-based choice task (Fig. 7a). This increase in striatal DA occurred selectively when mice were engaged in behavior, as no increase was seen when the mice were treated with drug but remained in the holding chamber (used for the baseline period) for the entire duration of the dialysis session, instead of entering the operant chamber to perform the task (Fig. 7a). Two-way ANOVA revealed a significant effect of treatment (F(2,14) = 10.34, p = 0.002) and a significant treatment × time interaction (F(14,98) = 4.33, p < 0.0001). The maximum increase in DA during the last three collection time points (Behavior 2 through Postbehavior) was significantly different across conditions (Fig. 7b; Work + Vehicle = 1.07 ± 0.04, Work + SB242084 = 1.38 ± 0.63, No Work + SB242084 = 1.1 ± 0.06, repeated-measures one-way ANOVA: F(1.756,12.29) = 7.11, p = 0.01). No such increase in DA in the NAc was observed (Fig. 7c,d). Placements of the microdialysis probes are represented in Figure 7e.
We analyzed behavior to determine whether heightened extracellular DA levels in the dorsal striatum were related to any changes in behavior. As shown in Figure 7f, we found that subjects treated with SB made more lever presses for the preferred reward throughout the entire session compared with vehicle (t(7) = 3.09, p = 0.018). Because of differences in the equipment used for the effort-based choice task and the PR and FR schedules, we could not measure IRTs (the time between the end of one lever press and the initiation of the next lever press) in this task. Instead, we used a related measure of vigor, the interval between lever press initiations, within bouts of lever pressing. When treated with SB, mice initiated lever presses more rapidly than when treated with vehicle (Fig. 7g; t(7) = 2.52, p = 0.04). Six of eight subjects ate more chow in the vehicle, compared with SB treatment condition, but a paired t test showed no significant difference in the amount of chow consumed across treatment conditions: Vehicle = 0.49 ± 0.07, SB = 0.42 ± 0.07 (t(7) = 0.80, p = 0.45).
DA receptor antagonism in the DMS reduces responding and prevents systemic SB242084 treatment from increasing response vigor
To determine whether DMS DA modulates operant response vigor, we implanted chronic guide cannulae into the DMS for acute drug infusion before behavioral testing in the effort-based choice task (RR15 for milk vs chow ad libitum). Infusion of the nonselective DA receptor antagonist flupenthixol (1.5 μg per side) slowed responding on the RR15 schedule. The interval between successive lever presses was increased (t(6) = 2.796, p = 0.0313; Fig. 8a), reducing the total number of lever presses made (t(6) = 2.939, p = 0.026; Fig. 8b), resulting in fewer earned rewards (t(6) = 3.412, p = 0.014; Fig. 8c). This dose of flupenthixol did not reduce interest in the reinforcer, as there was no difference in the percentage of earned rewards that were not collected (t(6) = 0.807, p = 0.45).
To test our hypothesis that systemic SB242084 increases vigor by potentiating DA release in the DMS, we tested whether the effects of SB could be blocked by DA antagonist injection in the DMS, at a dose that has no effect alone. To limit the number of infusions, we examined response vigor under three conditions: Vehicle injection with Sham DMS infusion (baseline), SB injection with flupenthixol infusion (0.5 μg per side), and Vehicle injection with flupenthixol infusion (0.5 μg per side). Repeated-measures one-way ANOVA revealed no differences between these conditions in any effort-related measures: average press initiation interval, F(1.49,8.94) = 2.82, p = 0.12 (Fig. 8e); total lever presses, F(1.45,8.72) = 4.0, p = 0.067 (Fig. 8f); rewards earned, F(1.488,8.93) = 1.078, p = 0.359 (Fig. 8g); or the percentage of uncollected rewards, F(1.44,8.65) = 0.508, p = 0.561 (Fig. 8h).
Discussion
Deficits in motivation are a debilitating symptom of several psychiatric and neurological conditions, and there are currently no treatments available to alleviate this symptom. We previously showed that the functionally selective ligand of a serotonin receptor, SB242084, increases motivation in mice (Simpson et al., 2011; Avlar et al., 2015; Bailey et al., 2016). This is consistent with recent reports that acute systemic treatment with the selective serotonin reuptake inhibitor, fluoxetine, reduces responding in effort-based choice tasks (Yohn et al., 2016a, b). Because motivated behaviors are driven by a computation between the costs and benefits related to the associated behavior, there are multiple possible mechanisms by which a drug, such as SB242084, might enhance motivation. At the stage of processing incoming information, both the reward value and effort demands must be encoded and cost-benefit computations must be made to guide the selection of actions. In addition to action selection, this information is also used to modulate the vigor (speed, amplitude, and frequency) of actions and the duration for which they are sustained to overcome the effort demanded by a given situation. Our analyses dissected some of these different behavioral processes and identified the mechanism by which SB242084 enhances motivated behavior in mice.
Motivation can be enhanced without changing behavioral or dopaminergic response to reward
Although we previously showed that acute treatment with SB242084 did not increase food intake (Bailey et al., 2016), home cage feeding may not reflect possible changes in reactivity to appetitive reinforcers. We directly tested the effect of SB242084 on reward choice and found that the drug did not alter preference or sensitivity to different types of rewards. We also evaluated the effect of SB242084 on reward processing at the neurochemical level. Using FSCV, we found that the drug does not alter phasic DA release in the NAc in response to reward or cues that predict reward. Because VTA DA neuron activity and DA release in the NAc are both associated with reward expectation, our FSCV data suggest that the enhancement in motivation following treatment with SB242084 is not the result of altered reactivity to rewards. This conclusion is consistent with the report that 5-HT2CR agonists impair operant conditioning in rats because of changes in motor capacity, rather than changes in the incentive value of food rewards (Bezzina et al., 2015). Our novel identification of a pharmacological manipulation that increases motivation without changing reactivity to rewards may be clinically important as it suggests a low potential for abuse.
Goal-directed behavior can be enhanced without increasing non–goal-directed actions
Some drugs, including amphetamine, increase instrumental responding but as a consequence of a generalized increase in locomotor activity (Bailey et al., 2015). Using a battery of behavioral tasks, we previously reported that SB242084 increases effort expenditure for food rewards without nonspecific hyperactivity or arousal (Bailey et al., 2016). Here we provide further evidence that only goal-directed actions are enhanced by SB242084. In an extinction test, where no reinforcement was available, SB242084 treatment did not alter response rates at any point during the session. The decline in response rate over the course of the session was also unaffected. This suggests that the drug's effect is highly specific to the process of actively engaging in goal-directed responding only when the goal is obtainable.
SB242084 treatment increases total goal-directed output by increasing the vigor of responding
We performed a detailed analysis of the temporal dynamics of responding in both progressive (PR) and fixed (FR) schedules of reinforcement. In a progressive schedule, SB242084 treatment increased the overall response rate. This increase in response rate had multiple contributing factors. Mice made more rapid presses within response bouts, their response bouts were longer, and they exhibited shorter pauses in between bouts of responding. SB242084 did not alter the time to return to work after reward delivery (PRP). A shortening of IRTs while the PRP is unaffected is the opposite of what has been found after DA manipulations in the NAc. In a comparatively low effort schedule (FR8), infusion of DA antagonists into the NAc disrupts returning to work (PRP), without affecting IRTs (Nicola, 2010). These contrasting findings are consistent with our observation that tonic and phasic DA in the NAc are not altered by SB242084 treatment. The same pattern emerged in FR schedules. SB242084-treated subjects reached higher peak response rates by executing responses more rapidly. These changes in response dynamics driven by drug treatment can be described as increases in response vigor and goal-directed persistence.
SB242084 treatment increases effort-related DA release in the dorsal striatum and not the NAc
We measured extracellular DA in the DMS during an effort-based choice task and found that SB242084 potentiated DA release, concomitant with an increase in response vigor. Several lines of evidence suggest that such an increase in DA in the DMS can support an increase in response vigor. First, we found that the drug only increased DA in DMS when the subjects had to execute effortful goal-directed action. Simply injecting the drug did not raise DA levels. Furthermore, in rats, lesion of the DMS impairs the modulation of response vigor in response to changes in the rate of reinforcement (Wang et al., 2013). Neuronal recordings in the DMS in both rodents and primates have identified neurons that encode aspects of response vigor, including velocity and amplitude of movements (Opris et al., 2011; Panigrahi et al., 2015; Bartholomew et al., 2016). Dopaminergic input into DMS is essential for adaptive changes in movement vigor, as evidenced by the loss of vigor modulation in a mouse model of Parkinson's disease in which dopaminergic innervation of the DMS progressively declines (Panigrahi et al., 2015). When nigrostriatal DA level declines, mice are selectively unable to complete the most demanding trials, while still able to complete less demanding ones. Therefore, DA tone is particularly important when high effort is demanded. Further evidence that DMS DA modulates vigor comes from optogenetic experiments. Response vigor can be increased by stimulating striatal output neurons in the direct pathway and decreased by stimulating neurons in the indirect pathway. DA antagonists abolish optogenetically driven changes in response vigor (Yttri and Dudman, 2016).
In an effort-based choice task, DA blockade in the DMS reduces responding and prevents systemic SB242084 treatment from increasing response vigor
Our observation that systemic SB242084 potentiates DA release in the DMS during effortful behavior is consistent with the finding that modulation of 5-HT2C receptors alters midbrain DA cell activity in brain slices and anesthetized animals (Di Matteo et al., 1999; Navailles et al., 2004). Infusing a nonselective DA antagonist into the DMS reduced response vigor, the opposite effect of SB242084. We also show that DA in the DMS is causally involved in the invigorating effect of SB242084. SB treatment did not increase vigor when flupenthixol was infused into the DMS. Together, our results show that an interaction between serotonin receptor signaling and DA enhances goal-directed vigor and persistence in mice.
5-HT2C receptor modulation is a promising target for the treatment of apathy
SB242084 is one of several compounds known to display functional selectivity at the 5-HT2CR. It acts as an inverse agonist on phospholipase A2 and the inhibitory G-protein, Gαi, and produces agonist effects on phospholipase C (De Deurwaerdère et al., 2004). Functionally selective compounds and allosteric modulators offer the exciting possibility of modulating select signaling pathways, thereby reducing the chances or severity of unwanted side effects, which may result from complete receptor blockade or activation (Conn et al., 2014; Kenakin, 2015).
SB242084 treatment does not result in a generalized hyperlocomotor or hyperarousal state, nor does it alter the estimation of reward value or response effort. Instead, it enhances response vigor during the ongoing pursuit of available goals. This suggests a complex interaction between serotonin and DA that regulates the dynamics of motivated behavior. Remarkably, the drug only increased DA levels in DMS when effortful responding was needed to obtain a goal and that goal was available. 5-HT2CRs are present in cortical and subcortical structures, including the midbrain, striatum, and ventral pallidum (Anastasio et al., 2010; Bubar et al., 2011; Graves et al., 2013). Because our studies involved systemic treatment with SB242084, we are unable to identify the specific circuit(s) mediating the 5-HT2CR-DA interaction. Based on previous studies in which 5-HT2C selective compounds were locally infused, there are at least two possibilities. Within the midbrain, the inhibitory action of SB242084 on GABA interneurons could reduce the GABAergic inhibitory control of DA neurons, resulting in increased DA cell activity and consequently greater DA release in striatum (Invernizzi et al., 2007). Alternatively, the inhibitory effect of SB242084 on GABA neurons in striatum that project to the midbrain may regulate striatal DA release (Burke et al., 2014). To determine the circuit-specific effect of SB242084 on motivated behavior, localized infusion studies or region-specific 5-HT2CR deletion studies would have to be performed during behavioral testing. A more detailed understanding of how SB242084 increases effort-related DA release may permit the development of treatments for patients with debilitating motivational deficits.
Footnotes
This work was supported by the National Institute of Mental Health Grant MH104718 to E.H.S. and Grant MH068073 to P.D.B., National Institute on Drug Abuse Grant DA022340 to J.F.C., and The Pew Charitable Trusts to E.P.B. We thank Ina Filla for helping to establish in vivo microdialysis in the laboratory.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Eleanor H. Simpson, Columbia University/New York State Psychiatric Institute, 1051 Riverside Drive, Box 87, New York, NY 10032. es534@cumc.columbia.edu