Abstract
Dopaminergic neurons contribute to intracranial self-stimulation (ICSS) and other reward-seeking behaviors, but it is not yet known where dopaminergic neurons intervene in the neural circuitry underlying reward pursuit or which psychological processes are involved. In rats working for electrical stimulation of the medial forebrain bundle, we assessed the effect of GBR-12909 (1-[2-[bis(4-fluorophenyl)-methoxy]ethyl]-4-[3-phenylpropyl]piperazine), a specific blocker of the dopamine transporter. Operant performance was measured as a function of the strength and cost of electrical stimulation. GBR-12909 increased the opportunity cost most subjects were willing to pay for a reward of a given intensity. However, this effect was smaller than that produced by a regimen of cocaine administration that drove similar increases in nucleus accumbens (NAc) dopamine levels in unstimulated rats. Delivery of rewarding stimulation to drug-treated rats caused an additional increase in dopamine concentration in the NAc shell in cocaine-treated, but not GBR-12909-treated, rats. These behavioral and neurochemical differences may reflect blockade of the norepinephrine transporter by cocaine but not by GBR-12909. Whereas the effect of psychomotor stimulants on ICSS has long been attributed to dopaminergic action at early stages of the reward pathway, the results reported here imply that increased dopamine tone boosts reward pursuit by acting at or beyond the output of the circuitry that temporally and spatially summates the output of the directly stimulated neurons underlying ICSS. The observed enhancement of reward seeking could be attributable to a decrease in the value of competing behaviors, a decrease in subjective effort costs, or an increase in reward-system gain.
Introduction
Intracranial self-stimulation (ICSS) was the subject of the very first experiment on the role of dopamine in reward seeking (Crow, 1970) and has continued to contribute heavily to the study of brain reward circuitry. ICSS is typically measured in the curve-shift (Edmonds and Gallistel, 1974; Edmonds and Gallistel, 1977; Miliaressis et al., 1986) or progressive-ratio (Hodos, 1961) paradigm. It has been demonstrated recently (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010; Trujillo-Pisanty et al., 2011) that neither method provides sufficient isolation of the different processes underlying reward seeking to distinguish between competing hypotheses concerning the variables to which dopamine neurons contribute, which include the sensitivity and gain of brain reward circuitry (Hernandez et al., 2010) and subjective effort cost (Salamone et al., 2005; Niv et al., 2007).
The ambiguity inherent in curve-shift and progressive-ratio measures is reduced by measuring ICSS as a function of both the strength and cost of reward (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010). This method, which produces a three-dimensional (3D) structure called the “reward mountain,” can distinguish between changes in the sensitivity of the reward circuitry and changes in a diverse set of variables that includes reward-circuit gain, subjective effort cost, and the value of alternate activities, such as grooming, exploring, and resting. Sensitivity of the ICSS substrate is indexed by the pulse frequency required to produce a reward of half-maximal subjective intensity; it is analogous to the affinity of a drug for a receptor. Gain indexes the maximum rewarding effect achievable; it is analogous to the relationship between the number of available receptors and the magnitude of a drug effect.
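The distinction between sensitivity and the other variables can be made concrete with a small numerical sketch. The functional forms below are an assumption chosen to match the verbal description (a logistic reward-growth function and a time-allocation function with two location, two slope, and two scale parameters); the cited papers give the model actually used. The names `reward_intensity`, `time_allocation`, `g`, and `a` are illustrative only.

```python
def reward_intensity(freq, f_hm, g):
    """Normalized reward intensity (0 to 1) as a function of pulse frequency.

    f_hm: pulse frequency that yields half-maximal intensity (sensitivity).
    g:    slope of the reward-growth function (assumed parameter name).
    """
    return freq**g / (freq**g + f_hm**g)


def time_allocation(freq, price, f_hm, p_e, g, a, t_min=0.0, t_max=1.0):
    """Proportion of time allocated to reward pursuit (assumed form).

    p_e: price at which allocation for a maximal reward is halfway
         between t_min and t_max.
    a:   slope of the price dependence (assumed parameter name).
    """
    r = reward_intensity(freq, f_hm, g)
    payoff = r**a
    cost = (price / p_e) ** a
    return t_min + (t_max - t_min) * payoff / (payoff + cost)


# Hypothetical parameter values, for illustration only.
baseline = time_allocation(freq=100.0, price=10.0, f_hm=50.0, p_e=10.0, g=4.0, a=3.0)
shifted = time_allocation(freq=100.0, price=10.0, f_hm=50.0, p_e=13.3, g=4.0, a=3.0)
print(f"allocation at p_e=10.0: {baseline:.3f}; at p_e=13.3: {shifted:.3f}")
```

In this assumed form, multiplying the reward intensity by a constant is algebraically equivalent to multiplying `p_e` by the same constant. This is one way to see why a change in gain, in subjective effort cost, or in the value of alternate activities moves the surface along the price axis, whereas a change in `f_hm` moves it along the pulse-frequency axis.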
Application of the 3D measurement method has challenged the long-standing hypothesis that psychomotor stimulants, such as cocaine, increase the sensitivity of the ICSS substrate (Crow, 1970; Esposito et al., 1978; Wise, 1980). By measuring displacement of the reward mountain by cocaine, Hernandez et al. (2010) showed that some combination of changes in gain, subjective effort costs, and the value of alternate activities is responsible for the drug-induced enhancement of ICSS performance. Although that experiment achieves greater specificity at the behavioral level than previous studies using two-dimensional (2D) measurement methods, the results are ambiguous at the neurochemical level because cocaine blocks the transporters for dopamine, norepinephrine, and serotonin (Iversen, 2000).
In the present study, we isolated the contribution of dopamine tone by measuring displacement of the reward mountain in response to GBR-12909 (1-[2-[bis(4-fluorophenyl)-methoxy]ethyl]-4-[3-phenylpropyl]piperazine) (GBR), a drug that blocks the dopamine transporter (DAT) with high specificity (Andersen, 1989). In parallel, we used in vivo microdialysis to measure the effect on dopamine tone in the nucleus accumbens (NAc) of GBR, cocaine, and their interaction with rewarding brain stimulation. The behavioral and neurochemical effects of GBR mimic some, but not all, of the effects of cocaine, thus implicating increased dopamine tone in the enhancement of reward pursuit by psychomotor stimulants and suggesting synergistic roles for dopamine and norepinephrine in the neurochemical and behavioral effects of cocaine.
Materials and Methods
Microdialysis experiments
Subjects.
Twenty-four male Long–Evans rats (Charles River) weighing between 350 and 400 g at the time of surgery served as subjects for the microdialysis experiments. The rats were housed individually in hanging cages and maintained on a 12 h reverse light/dark cycle (lights off from 8:00 A.M. to 8:00 P.M.), with ad libitum access to water and food (Purina Rat Chow).
Surgery.
Atropine sulfate (0.5 mg/kg, s.c.) was administered to reduce bronchial secretions before induction of anesthesia with a ketamine (10 mg/kg, i.p.)/xylazine (100 mg/kg, i.p.) mixture. The topical anesthetic, xylocaine, was applied prophylactically to the external auditory meatus to reduce discomfort that could arise from the ear bars after the rat was mounted in the stereotaxic frame. Isoflurane was used to maintain anesthesia. A 20 gauge guide cannula (Plastics One) for microdialysis was aimed stereotaxically at the NAc septi [1.5 mm anteroposterior (AP), ±2.8 mm mediolateral (ML), and −5.4 mm dorsoventral (DV) from skull at a 10° angle].
GBR blocks the DAT with a much longer half-life than cocaine (Menacherry and Justice, 1990). To bring the time courses of the effects produced by the two drugs into closer concordance, cocaine was administered continuously. A route for continuous administration was established by implanting perforated Tygon tubing subcutaneously, as described previously (Hernandez et al., 2008). In 13 of these subjects, a monopolar stimulating electrode was aimed at the lateral hypothalamus (LH; −2.8 mm AP, 1.7 mm ML, and −8.8 mm DV from skull) ipsilateral to the cannula. The electrode was made of stainless-steel wire (0.25 mm diameter) and insulated with Formvar except for the region extending 0.5 mm from the tip. The anode consisted of two stainless-steel screws fixed in the skull, around which the return wire was wrapped. The electrode and the cannula were secured with dental acrylic and skull-screw anchors. At the end of the surgery, the rats were injected with buprenorphine (0.05 mg/kg, s.c.) to reduce pain and with a sterile saline solution (1 ml/kg, s.c.) to provide fluid replacement. The rats were allowed to recuperate for 5–7 d after surgery before any experimental manipulation.
Self-stimulation training.
Each of the rats implanted with stimulating electrodes was shaped to lever press for a 0.5 s train of cathodal, rectangular, constant-current pulses, 0.1 ms in duration. Shaping took place in a Plexiglas operant chamber (30 cm long × 21 cm wide × 51 cm high) equipped with a retractable lever located on the right wall of the box and a cue light positioned 1.5 cm above the lever. A continuous reinforcement schedule was in force. The self-stimulation training was performed as in previous experiments (Hernandez et al., 2006, 2007). Once the rat pressed the lever consistently for currents between 125 and 400 μA, a time-allocation versus pulse-frequency curve was obtained by varying the stimulation frequency across trials over a range that drove the number of rewards earned from maximal to minimal levels; the pulse frequency was decreased from trial to trial by 0.08 log10 units. The series of trials conducted to obtain a time-allocation versus pulse-frequency curve is called a “frequency sweep.” The frequency used during the subsequent microdialysis sampling was 1 log10 unit greater than the lowest frequency that supported a maximal response rate, as determined from the time allocation-frequency curve.
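The construction of a frequency sweep can be sketched as follows. The 0.08 log10 step comes from the text; the starting frequency and the number of trials are hypothetical, as is the function name.

```python
import math


def frequency_sweep(start_hz, n_trials, step_log10=0.08):
    """Pulse frequencies for successive trials, descending in equal log10 steps."""
    return [start_hz * 10.0 ** (-step_log10 * i) for i in range(n_trials)]


# Hypothetical starting frequency and trial count.
sweep = frequency_sweep(start_hz=200.0, n_trials=10)
for trial, hz in enumerate(sweep, start=1):
    print(f"trial {trial:2d}: {hz:6.1f} Hz (log10 = {math.log10(hz):.2f})")
```

Each step of 0.08 log10 units lowers the pulse frequency by a constant proportion (a factor of about 0.83), so the tested values are evenly spaced on a logarithmic axis.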
In vivo microdialysis.
Testing was conducted in operant chambers similar to those used during training but with the levers removed. Each testing chamber was housed in a dark Styrofoam-lined enclosure with a small opening at the top. All testing took place during the dark phase of the circadian schedule. The methodology for microdialysis sampling has been described in detail previously (Hernandez et al., 2006, 2008). Dialysate samples were collected every 20 min and immediately analyzed. Baseline was defined as a series of three consecutive microdialysis samples in which the dopamine concentration fluctuated by ≤5%. After the baseline was determined, a single intraperitoneal injection of GBR (10 mg/kg) was given or the subcutaneous infusion of cocaine began (1.75 mg · kg−1 · h−1). For the rats that were not implanted with electrodes, dialysate samples were collected for an additional 360 min. For the rats implanted with electrodes, samples were collected for an additional 200 min (cocaine) or 120 min (GBR) before stimulation commenced; after the onset of the stimulation, sampling continued for 140 min (cocaine) or 240 min (GBR).
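One possible reading of the baseline criterion is sketched below. The text does not specify how the fluctuation was computed; here it is assumed that each of three consecutive samples must lie within 5% of their mean. The sample values and function names are hypothetical.

```python
def find_baseline(samples, tolerance=0.05, window=3):
    """Return (start_index, baseline_mean) for the first stable window, or None.

    Assumed criterion: every sample in the window lies within
    `tolerance` (as a proportion) of the window mean.
    """
    for start in range(len(samples) - window + 1):
        chunk = samples[start:start + window]
        mean = sum(chunk) / window
        if all(abs(x - mean) <= tolerance * mean for x in chunk):
            return start, mean
    return None


def percent_of_baseline(value, baseline_mean):
    """Express a dialysate concentration as a percentage of baseline."""
    return 100.0 * value / baseline_mean


# Hypothetical dopamine concentrations (arbitrary units), one per 20 min sample.
concentrations = [10.0, 12.0, 9.0, 10.0, 10.2, 9.9, 11.0]
result = find_baseline(concentrations)
print(result)
```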
Stimulation during microdialysis sessions.
During the microdialysis sessions, stimulation trains were delivered according to a variable-time, 12-s schedule of reinforcement. The schedules were programmed using LabVIEW software (National Instruments) installed on an IBM laptop computer. The intervals constituting the variable-time schedule were drawn from lagged exponential distributions with a mean of 11 s; a fixed lag of 1 s was added to each interval to prevent the stimulation trains from overlapping in time or occurring at very short temporal offsets. The stimulation was delivered by a Master 8 pulse generator (A.M.P.I.), which drove a constant-current amplifier (Mundl, 1980). The stimulation current was monitored with an oscilloscope (Tektronix TDS2014), which read the voltage drop across a 1 kΩ resistor (1% precision) in series with the electrode.
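The interval generation described above can be sketched in a few lines. This is an illustration in Python rather than the LabVIEW program used in the experiment; the function name and seed are hypothetical.

```python
import random


def variable_time_intervals(n, mean_exp_s=11.0, lag_s=1.0, seed=None):
    """Draw n inter-train intervals from a lagged exponential distribution.

    Each interval is a fixed lag plus an exponentially distributed
    component, so the overall mean is lag_s + mean_exp_s (12 s here)
    and no interval is shorter than lag_s.
    """
    rng = random.Random(seed)
    return [lag_s + rng.expovariate(1.0 / mean_exp_s) for _ in range(n)]


intervals = variable_time_intervals(20000, seed=1)
print(f"shortest interval: {min(intervals):.2f} s")
print(f"mean interval: {sum(intervals) / len(intervals):.2f} s")
```

The fixed 1 s lag guarantees a minimum separation that exceeds the 0.5 s train duration, which is how the schedule avoids overlapping trains.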
Analytical chemistry.
The procedures for analytical chemistry have been described previously (Hernandez et al., 2006).
Histology.
After completion of the experiment, a lethal dose of sodium pentobarbital was administered. If a stimulation electrode had been implanted, iron was deposited at the site of the electrode tip by passing a 1 mA current for 15 s, with the electrode as the anode and the skull screws as the cathode. The animals were then perfused intracardially with 0.9% sodium chloride, followed by 10% Formalin; if electrodes were implanted, a Formalin–Prussian Blue solution (10% Formalin, 3% potassium ferricyanide, 3% potassium ferrocyanide, and 0.5% trichloroacetic acid) was used in lieu of 10% Formalin. The Formalin-Prussian Blue solution forms a blue precipitate from the iron particles deposited at the electrode tip. After perfusion, the animals were decapitated, and the brains were removed from the skulls and fixed with 10% Formalin solution for at least 7 d. Coronal sections, 40 μm thick, were cut with a cryostat (Thermo Fisher Scientific). The probe and electrode-tip locations were determined microscopically at low magnification with reference to the stereotaxic atlas of Paxinos and Watson (2007). Histological reconstructions show that the probe tips were located within the shell region of the NAc (Figs. 1, 2a), and the electrode tips were located in the LH (Fig. 2b).
Behavioral experiment
Subjects were 10 male Long–Evans rats (Charles River) weighing between 350 and 400 g at the time of surgery. They were housed and fed as described above.
Surgery.
The subjects were prepared for surgery as described above, with the exceptions that stimulating electrodes were aimed bilaterally at the LH (−2.8 mm AP, 1.7 mm ML, and −8.8 mm DV from skull), and no cannula was implanted. The monopolar stainless-steel electrodes were constructed as described above. A 5–7 d period was provided for postsurgical recuperation before the self-stimulation training began.
Self-stimulation training and stabilization.
Self-stimulation of both LH sites was assessed, and the electrode that supported the most vigorous lever pressing in the absence of motor side effects was chosen for additional testing. Shaping was done as described above. A cumulative handling-time schedule of reinforcement (Breton et al., 2009b) controlled the delivery of rewarding stimulation. Under this schedule, a reward is delivered when the cumulative time that the lever has been depressed reaches a value set by the experimenter (the “price” of the reward). Depression of the lever was accompanied by illumination of the neighboring cue light. As soon as the rat satisfied the response criterion, the lever was retracted, and a stimulation train was delivered. After a 2 s delay, the lever was reintroduced into the cage, the cumulative timer was reset to zero, and the rat could resume working to obtain another reward.
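The logic of the cumulative handling-time schedule can be illustrated with a simplified sketch. It takes a list of lever-hold durations and counts the rewards earned at a given price. It omits the 2 s lever-retraction delay and real-time control, and the function name and data are hypothetical.

```python
def rewards_earned(hold_durations_s, price_s):
    """Count rewards under a cumulative handling-time schedule.

    Hold time accumulates across separate lever depressions. When the
    total reaches the price, a reward is delivered and the timer is
    reset to zero. Any hold time beyond the criterion is discarded,
    because the lever retracts as soon as the criterion is met.
    """
    cumulative = 0.0
    rewards = 0
    for hold in hold_durations_s:
        cumulative += hold
        if cumulative >= price_s:
            rewards += 1
            cumulative = 0.0
    return rewards


# Hypothetical hold durations (seconds) at a price of 4 s.
holds = [1.5, 1.5, 1.0, 2.0, 2.0, 3.9]
print(rewards_earned(holds, price_s=4.0))
```

Because the requirement is defined on cumulative hold time rather than on a count of presses, the price directly expresses the time the rat must give up to obtain a reward.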
Each trial consisted of a fixed time during which the price and pulse-frequency parameters were held constant. The duration of each trial was sufficient to allow a rat that allocated all of its time to lever pressing to harvest 20 rewards. At the end of each trial and before the start of the next one, the lever retracted for 10 s, and the house light flashed. Two priming trains were delivered during the final 2 s of the intertrial interval. The priming stimulation was held constant across trials and was delivered at a pulse frequency that had been shown previously to support vigorous responding; the remaining parameters were the same as those used during the test trials.
During the initial training, the price of the reward was increased from 1 s of cumulative lever depression to 4 s, the value that would be used during the frequency sweeps throughout the saline condition of the experiment. This price (i.e., opportunity cost) was selected because, at this and greater values, objective and subjective prices have been shown to correspond closely (Solomon et al., 2007). Once performance stabilized across successive frequency sweeps, “price-sweep” testing commenced. During price sweeps, the pulse frequency was set to the maximum value used during the frequency sweeps, and the price of the reward was increased successively from trial to trial. Once performance stabilized across successive price sweeps, “radial-sweep” testing commenced. At each step along a radial sweep, the pulse frequency was decreased and the price was increased.
Two sweeps of each type were run during every stabilization session. We use the term “survey” to refer to the combination of a frequency sweep, a price sweep, and a radial sweep; these provide the minimal dataset required to fit the mountain model. The sequence of sweeps was random within session for subjects GBR2–GBR8 and random within survey for GBR11–GBR14. These two randomization approaches differ in terms of the condition for repeating a particular sweep. In the survey method, a sweep could be repeated only when a set, consisting of one instance of each sweep type, had been completed. In the session method, a given sweep type might be tested twice before one or both of the others had been tested in that session. The survey approach makes it possible to fit two mountains to the data from a single session, whereas the session method allows only one to be fit. Thus, the survey approach was introduced to increase the power of the resampling-based surface-fitting approach.
Self-stimulation testing.
The pharmacological treatment began after stable performance was achieved in stabilization sessions that included all three sweep types. GBR was dissolved in sterile saline at a concentration of 10 mg/ml and adjusted to a pH of 5 ± 0.1 by means of the addition of 0.1 M NaOH. The drug solution was injected intraperitoneally at a dose of 10 mg/kg, and the vehicle solution, also injected intraperitoneally, consisted of sterile physiological saline (0.9%).
Vehicle sessions were run on Mondays and Thursdays and were composed of two sets of frequency, price, and radial sweeps. The order of the sweeps was randomized in the same manner as in the stabilization sessions.
Drug sessions were run on Tuesdays and Fridays. Because of the effect of the drug on the position of the mountain, it was necessary to structure these sessions in a different manner than the vehicle sessions. As described below, the drug generally shifted the mountain rightward along the price axis. For ease of comparison between the two datasets, one frequency sweep was run at the same price as that used in the vehicle sessions, and a second frequency sweep was added at a higher price estimated to offset the shift produced by the drug. The shift necessitated testing higher prices, and the additional time involved, coupled with the addition of the second frequency sweep, made it infeasible to collect two complete sets of sweeps in a single session. In the case of rats GBR2–GBR8, multiple sessions were required to obtain a single survey of the mountain under the influence of GBR; in the case of rats GBR11–GBR14, each drug session provided one complete survey (one sweep of each type). In the price sweeps performed with rat GBR6, there was no drug-induced shift to offset, so it was not possible to generate a high-price frequency sweep for this rat. Consequently, in this subject only, a complete survey of the mountain in the drug condition consisted of only three sweeps.
The self-stimulation tests began 2 h after the GBR or saline injection. The first determination of the time-allocation versus frequency curve was considered a warm-up and was not included in the analysis. The collection of the behavioral data was restricted to the period when the GBR-induced elevation in dopamine concentration had been shown to be stable by means of the microdialysis data reported below. After the first week of experimentation, a preliminary fit of the mountain model to the data was performed, and the results were used to adjust the tested values of pulse frequency and price so as to optimize sampling. The new values were selected to accommodate the drug-induced displacement of the 3D structure and to select the price for the high-price frequency sweep that was included in the drug condition. The price in question was chosen to offset the effect of the drug so that, in the plot representing time allocation as a function of pulse frequency, the high-price frequency sweep performed in the GBR condition would overlap the plot obtained at the lower price used in the saline condition. Thus, the price used for the high-price frequency sweep exceeded the price used for the frequency sweep in the vehicle condition by an amount equal to the estimated drug-induced shift of the 3D structure along the price axis.
Statistical treatment of behavioral data.
The 3D model was fit separately to the data from the vehicle and drug sessions using the nonlinear least-squares routine in the MATLAB Optimization Toolbox (MathWorks). The fitting approach has been described in detail previously (Hernandez et al., 2010). The objective was to obtain unbiased estimates of the parameters of the reward-mountain surface and their dispersions without making unrealistic assumptions about normality and lack of correlation between parameter values. Our approach is based on resampling (Efron and Tibshirani, 1993). Multiple datasets (1000) are generated by randomly sampling the time-allocation values with replacement. The mountain model is fit to each dataset or to each of its component surveys, and descriptive statistics (means and 95% confidence intervals) are generated for each parameter.
The mountain model has two location parameters (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010), which set the position of the 3D structure along the pulse-frequency and price axes. The pulse frequency at which reward intensity is half-maximal is designated Fhm, whereas Pe designates the price at which the proportion of time allocated to pursuit of a maximal reward falls halfway between the base and summit of the mountain. The 3D structure is considered to have shifted when zero falls outside the 95% confidence interval about the difference between the estimates for a location parameter obtained in the drug and vehicle conditions.
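The shift criterion can be sketched as follows. The example assumes a percentile confidence interval and assumes that resampled estimates from the two conditions are paired by index; neither detail is stated in the text. The estimates are synthetic values generated for illustration and are not data from the experiment.

```python
import math
import random


def percentile_ci(values, level=0.95):
    """Percentile confidence interval from a list of resampled values."""
    ordered = sorted(values)
    n = len(ordered)
    lo_idx = int(math.floor((1.0 - level) / 2.0 * (n - 1)))
    hi_idx = int(math.ceil((1.0 + level) / 2.0 * (n - 1)))
    return ordered[lo_idx], ordered[hi_idx]


def has_shifted(drug_estimates, vehicle_estimates, level=0.95):
    """True when zero falls outside the CI of the drug-minus-vehicle difference."""
    diffs = [d - v for d, v in zip(drug_estimates, vehicle_estimates)]
    lo, hi = percentile_ci(diffs, level)
    return not (lo <= 0.0 <= hi), (lo, hi)


# Synthetic log10(Pe) estimates, 1000 per condition (illustration only).
rng = random.Random(0)
vehicle = [rng.gauss(1.000, 0.02) for _ in range(1000)]
drug = [rng.gauss(1.125, 0.02) for _ in range(1000)]
shifted, ci = has_shifted(drug, vehicle)
print(shifted, ci)
```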
The resampling method used depended on how the drug sessions were structured. In the cases of rats GBR11–GBR14, each drug session included a complete survey of the mountain. In these cases, the data were resampled by survey (Hernandez et al., 2010; Trujillo-Pisanty et al., 2011). For example, 10 drug sessions were run with rat GBR11. One thousand datasets, consisting of 10 sessions each, were generated by random resampling with replacement. One such dataset might consist of data from sessions 2, 3, 3, 4, 5, 5, 7, 8, 10, 10, and another might consist of data from sessions 1, 2, 3, 4, 4, 6, 6, 7, 8, 9. To construct surveys for rats GBR2–GBR8, we followed the same procedure used by Hernandez et al. (2010). Three pools were constructed, consisting of all sweeps of a given type. Surveys were built by drawing one sweep randomly from each of the pools. The number of surveys in each of the resampled datasets was equal to the number of sessions run in the drug condition.
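The two resampling schemes can be sketched as follows. This is a schematic illustration, not the MATLAB code used for the analysis; sessions and sweeps are represented by placeholder labels, and the function names are hypothetical.

```python
import random


def resample_sessions(session_ids, n_datasets=1000, seed=None):
    """Resample whole sessions with replacement (each session holds one survey)."""
    rng = random.Random(seed)
    k = len(session_ids)
    return [sorted(rng.choices(session_ids, k=k)) for _ in range(n_datasets)]


def build_surveys(freq_pool, price_pool, radial_pool, n_surveys, seed=None):
    """Build surveys by drawing one sweep at random from each pooled sweep type."""
    rng = random.Random(seed)
    return [
        (rng.choice(freq_pool), rng.choice(price_pool), rng.choice(radial_pool))
        for _ in range(n_surveys)
    ]


# Ten sessions, as in the example given for rat GBR11.
datasets = resample_sessions(list(range(1, 11)), n_datasets=1000, seed=0)
print(datasets[0])

# Placeholder sweep labels for the pooled scheme.
surveys = build_surveys(["F1", "F2", "F3"], ["P1", "P2"], ["R1", "R2"], n_surveys=4, seed=0)
print(surveys)
```

In the first scheme, the sweeps collected together in a session stay together in every resampled dataset. In the second, sweeps from different sessions can be combined within a survey.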
Two different versions of the mountain model were fit to the data. One has six parameters [two location parameters, two slope parameters, and two scale parameters (maximum and minimum time allocation)], whereas the other includes a seventh parameter that can be interpreted to represent the conditioned value of reward-related stimuli or reward-seeking actions (Hernandez et al., 2010). Each of these models was fit in two different ways. The “location-specific” method entails estimating location-parameter values for each individual survey. This method defends the location-parameter estimates against the bias introduced by within-condition shifts of the mountain. The “all-common” method estimates a single set of location parameters for each dataset, thus minimizing the number of parameter values estimated.
Before resampling, both versions of the mountain model were fit to the data using both the location-specific and all-common methods. The Akaike information criterion (AIC) (Akaike, 1974) was then computed for each fit, and its value was used to select the best model and fitting method. The subsequent resampling made it possible to refine the parameter estimates and to measure their dispersion.
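The model-selection step can be sketched as follows. The text does not state which form of the AIC was used; the example assumes the common least-squares form, AIC = n ln(RSS/n) + 2k, where n is the number of observations, RSS is the residual sum of squares, and k is the number of fitted parameters. The RSS values are hypothetical.

```python
import math


def aic_least_squares(rss, n_obs, n_params):
    """AIC for a least-squares fit, assuming Gaussian residuals."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params


def select_best(candidates, n_obs):
    """Return the candidate with the lowest AIC and all AIC scores.

    candidates maps a label to (rss, n_params).
    """
    scores = {
        name: aic_least_squares(rss, n_obs, k)
        for name, (rss, k) in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical residual sums of squares for two model versions.
candidates = {
    "six-parameter": (1.20, 6),
    "seven-parameter": (1.19, 7),
}
best, scores = select_best(candidates, n_obs=120)
print(best, scores)
```

In this hypothetical case, the seventh parameter reduces the residual error only slightly, so the penalty of 2 per added parameter outweighs the improvement and the simpler model is preferred. The location-specific method adds parameters in the same sense, because it estimates a separate pair of location parameters for each survey.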
Inferential statistics and graphs were based on the surfaces defined by the mean parameter estimates and 95% confidence intervals derived from the resampling procedure for the model and fitting method deemed best by the AIC. Graphs were plotted using Origin (OriginLab Corp.).
Histology.
The histological procedure was performed as described above. The electrode tips were located in the LH (Fig. 3).
Results
Microdialysis experiments
The goal of the first experiment was to ascertain whether the changes in extracellular dopamine produced by a single intraperitoneal injection of GBR (10 mg/kg) resemble those produced by the lowest effective dose of continuous, subcutaneously administered cocaine (Hernandez et al., 2010). In addition, we wanted to replicate previous studies that suggest that a single injection of GBR is sufficient to produce a long-lasting and stable increase in extracellular dopamine (Rothman et al., 1991; Budygin et al., 2000; Gagnaire and Micillino, 2006), a necessary condition for running the mountain experiment.
As shown in Figure 4a, both a single intraperitoneal injection of GBR (10 mg/kg, n = 6) and continuous subcutaneous infusion of cocaine (1.75 mg · kg−1 · h−1, n = 5) increased dopamine levels in the NAc shell. In the case of the behavioral results described below, data acquisition began 120 min after administration of GBR. In the microdialysis data shown in Figure 4a, dopamine levels had approached asymptote by that time point in response to both drug treatments and remained quite stable for an additional 4 h. The two drug-administration regimens appear fairly well matched in terms of the asymptotic concentrations of dopamine in the dialysate, which were 210 and 190% of baseline for GBR and cocaine, respectively (means of observations obtained 120–360 min after onset of drug treatment).
Figure 4b depicts the time course of changes in dopamine concentration observed during delivery of medial forebrain bundle (MFB) stimulation in two additional groups of drug-treated subjects (GBR, n = 6; cocaine, n = 7). In previous behavioral testing, the rats had worked vigorously for identical stimulation trains, indicating that these trains were rewarding. Conditions before stimulation onset were the same as those in force when the data in Figure 4a were obtained, and reasonably similar results were observed. Again, the two drug-administration regimens appear fairly well matched in terms of their effects on the asymptotic levels of dopamine in the dialysate, which were 246 and 232% of baseline values for GBR and cocaine, respectively (means of observations obtained 60–120 and 120–200 min after onset of GBR and cocaine treatment, respectively). After these plateaus in dopamine levels were observed, delivery of MFB stimulation commenced (GBR, light-blue background; cocaine, light-red background). Whereas the MFB stimulation failed to further boost dopamine levels in the GBR-treated rats (blue time course), it markedly increased dopamine concentration in the cocaine-treated rats (red time course), reaching a second plateau at 439% of the baseline values (mean of last four observations). A repeated-measures ANOVA was performed by means of Statistica (Statsoft) on the dopamine concentrations measured during the time when brain stimulation reward (BSR) was delivered to both groups (220–340 min after the onset of drug treatment). The across-group difference in dopamine levels meets the criterion for statistical significance (F(1,11) = 5.85, p = 0.03), as does the interaction between the sample time and drug treatment (F(6,66) = 2.53, p = 0.02).
Behavioral experiment
Surface fitting
Table 1 shows the AIC values for the fits to six-parameter and seven-parameter models performed using the all-common or location-specific approach. In the vehicle condition, the seven-parameter model proved best in 6 of 10 cases, despite the penalty imposed by the AIC for additional parameters. The location-specific method, which better defends the estimates of the slope parameters against the bias introduced by within-condition shifts, proved superior in 8 of 10 cases. In the drug condition, the seven-parameter model provided the best fit in only 3 of 10 cases, and the location-specific method fared best in 9 of 10 cases. The adjusted R2 for the best-fitting surfaces for the vehicle mountain ranged from 0.954 to 0.988 and from 0.930 to 0.970 for the GBR mountain surface. These values suggest that the 3D surfaces fit the time-allocation data well.
2D representation
Figure 5 shows 2D projections of the fitted surface and the behavioral data for rat GBR12. We show these 2D projections to facilitate comparison with the results of previous studies using curve-shift or progressive-ratio scaling. However, previous papers describing the mountain model and its application (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010) demonstrate that a 3D surface must be fit to the data to determine how the mountain has been displaced by experimental treatments, such as drug administration. This point is made with particular clarity in the movies referenced in the note below. Displacement of the mountain cannot be discerned unambiguously by means of visual inspection of 2D projections (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010).
To obtain the data shown in Figure 5, the pulse frequency was decreased and/or the price was increased from trial to trial sequentially (“swept”). Figure 5a shows the frequency-sweep curves obtained, at a price of 4 s, for the vehicle (red) and GBR (pink) conditions. This panel represents the data in the same manner as in conventional curve-shift studies. The data points obtained in the GBR condition (pink) are displaced to the left of those obtained in the vehicle condition (red), as would be expected on the basis of previous studies (Maldonado-Irizarry et al., 1994; Melnick et al., 2001). However, when the price at which the mountain is sectioned is increased, the two sets of data points overlap closely, as shown in Figure 5b. In Figure 5c, price-sweep data are shown in lighter blue for the vehicle condition and in darker blue for the GBR condition. GBR produced a rightward displacement of the price-sweep curves. The radial-sweep data from the vehicle (light green) and GBR (dark green) conditions are plotted against the pulse-frequency axis in Figure 5d and against the price axis in Figure 5e.
3D representation
Our 3D analysis of performance for BSR (Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010) reveals that 2D depictions, such as those in Figure 5, are fundamentally ambiguous with regard to the direction in which a drug treatment has shifted the mountain. At pulse frequencies that produce submaximal rewarding effects, the surface of the mountain is oriented diagonally. Thus, displacement of a 2D section, such as the curves in Figure 5, could arise from a shift in the mountain along the depicted x-axis, a shift along the unseen, orthogonal, independent-variable axis, or both. This ambiguity is resolved by plotting the data in a 3D space, as shown in Figure 6, which shows the data from Figure 5 in this new perspective. Mean time allocation is represented by the spherical symbols; the wire mesh depicts the surface obtained by fitting the reward mountain model to the vehicle data (a) and to the GBR data (b). Figure 7 shows the same data replotted as contour graphs constructed by plotting on a plane the cross-sections obtained by horizontally slicing the fitted surface at fixed intervals representing 10% changes in time allocation. To facilitate comparison with the contour graph and data from the GBR condition (b), the contour graph and data from the vehicle condition are shown twice (a and a′). The pulse frequencies and prices sampled along the pulse-frequency, price, and radial sweeps are indicated by the circular symbols. The price sweep constrains the position of the mountain along the price axis, whereas the frequency sweep constrains the position along the pulse-frequency axis. The radial sweep determines the curvature of the contour lines while providing additional positioning information, which is most precise when the radial sweep passes through the point defined by the two location parameters (Fhm, Pe).
The superimposed solid lines in Figure 7 represent the location parameters (Fhm, red; Pe, blue) whereas the dashed lines represent the 95% confidence interval around each of the parameter estimates. As indicated by the blue arrow, GBR moved the structure rightward along the price axis by 0.125 log10 units (thus, increasing Pe to 1.33 times the value obtained in the vehicle condition) but moved it downward along the pulse-frequency axis by only 0.009 log10 units (to 0.98 of the value obtained in the vehicle condition). The bar graph summarizes the displacement of the mountain along the two axes, and the error bars represent 95% confidence intervals surrounding the change in the parameter estimates. The 95% confidence interval surrounding the change in Fhm (red) includes zero, and thus the tiny displacement along the pulse-frequency axis fails to meet the criterion for statistical reliability. In contrast, zero falls well outside the 95% confidence interval surrounding the change in Pe, and thus, the much larger displacement of the mountain along the price axis easily meets the criterion for statistical reliability.
The fact that the radial sweeps in the vehicle and drug conditions pass slightly to either side of the intersection of the location parameters (Fig. 7a,a′,b) makes the 2D depictions of these sweeps (Fig. 5d,e) ambiguous with respect to the direction in which the mountain has shifted. This ambiguity is resolved by the contour-graph representation (Fig. 7), which shows clearly that the mountain was shifted rightward along the price axis by GBR.
Figure 8 and Table 2 summarize the movement of the mountain along the pulse-frequency and price axes for all the subjects tested. In 7 of the 10 experimental subjects, the displacement of the mountain along the price axis is statistically reliable, and, in an eighth case, the shift falls just short of the criterion. The displacements vary across subjects between −0.013 and 0.367 log10 units, and the average displacement is 0.139 log10 units (SEM = 0.034). This means that, on average, the price at which the rats allocated half of their time in pursuit of a maximally intense reward strength was 1.38 times higher after administration of GBR (10 mg/kg, i.p.) than after administration of the vehicle.
The displacement of the mountain along the pulse-frequency axis varies across subjects from −0.041 to 0.23 log10 units; the average displacement is 0.021 log10 units (SEM = 0.025). This means that, on average, the frequency that produced a half-maximal reward intensity was merely 1.05 times higher under the influence of GBR than in the vehicle condition. As shown in Figure 8, none of the displacements along the pulse-frequency axis meet the criterion for statistical reliability.
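The conversions between displacements in log10 units and multiplicative changes reported above can be checked with a short Python sketch. The helper name is hypothetical; the shift values are those given in the text for the single subject shown in Figure 7 and for the group means.

```python
def log10_shift_to_ratio(shift_log10):
    """Convert a displacement in log10 units into a multiplicative change."""
    return 10.0 ** shift_log10

# Shift values as reported in the text (log10 units).
shifts = {
    "single subject, Pe": 0.125,
    "single subject, Fhm": -0.009,
    "group mean, Pe": 0.139,
    "group mean, Fhm": 0.021,
}

for label, shift in shifts.items():
    ratio = log10_shift_to_ratio(shift)
    print(f"{label}: {shift:+.3f} log10 units -> x{ratio:.2f}")
```

The resulting ratios (1.33, 0.98, 1.38, and 1.05) agree with the values stated in the text.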
Discussion
ICSS can be altered by drug action at different stages of the underlying neural circuitry (Gallistel, 1978; Gallistel et al., 1981; Arvanitogiannis and Shizgal, 2008; Hernandez et al., 2010). The first event is a volley of action potentials in the directly stimulated neurons, the effects of which are integrated spatially and temporally (Gallistel, 1978; Gallistel et al., 1981; Simmons and Gallistel, 1994; Sonnenschein et al., 2003) to yield a neural signal representing reward intensity. The drug-induced enhancement of ICSS by psychomotor stimulants was attributed initially to action on these early stages of the circuitry. For example, Wise (1980) proposed that drugs of abuse lower the threshold of the circuitry to exogenous excitation (presumably at the integration stage) or reduce the input required from the electrode attributable to pharmacological activation of the substrate. We refer to such effects as changes in the sensitivity of the reward substrate, the variable that determines the strength of the electrical input required to drive the rewarding effect to a given proportion of its maximal value (Hernandez et al., 2010). The function that maps the strength (e.g., pulse frequency) of the stimulation into the intensity of the rewarding effect is called the reward-growth function (Leon and Gallistel, 1992). Changes in sensitivity displace this function along the strength axis just as changes in the affinity of a drug for a receptor displace the concentration–effect curve along the concentration axis.
Arvanitogiannis and Shizgal (2008) and Hernandez et al. (2010) demonstrated that 2D measurement methods, such as the curve-shift or progressive-ratio method, cannot distinguish changes in sensitivity from changes in a set of variables that includes reward probability, subjective effort cost, the value of alternate activities, such as grooming, resting, and exploring, and reward-system gain, the variable that sets the vertical scale of the reward-growth function.
The reward-mountain method entails measurement of ICSS performance as a function of both stimulation strength (pulse frequency) and opportunity cost (price). Changes in sensitivity shift the mountain along the strength axis, whereas changes in gain, reward probability, subjective effort costs, or the value of alternate activities shift the mountain along the price axis. Changes in sensitivity are attributable to actions before the output of the reward-growth function, whereas changes in gain, probability, subjective effort costs, and the value of alternate activities are attributable to actions downstream from this point in the reward circuitry. Manipulations that act before the output of the reward-growth function include changes in current, which alter the number of directly activated neurons, and changes in the train duration, which can alter the pulse frequency required to drive the output of the integrator to a particular level. Both manipulations shift the mountain along the pulse-frequency (strength) axis (Arvanitogiannis and Shizgal, 2008). Probability discounting acts downstream from the output of the reward-growth function and shifts the mountain along the price axis (Breton et al., 2009a).
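The independence of the two location parameters can be sketched in Python under an assumed form of the model: a logistic reward-growth function of pulse frequency (half-maximal at Fhm) feeding a logistic time-allocation function of price (half-maximal at Pe for a maximal reward). The function name, the exponents a and g, and every numerical value below are hypothetical and serve only to illustrate the logic; they are not the fitted estimates.

```python
import math

def time_allocation(f, p, fhm, pe, a=3.0, g=5.0, t_min=0.0, t_max=1.0):
    """Assumed reward-mountain surface: time allocation as a function of
    pulse frequency f and price p, given location parameters fhm and pe."""
    r = f**g / (f**g + fhm**g)             # normalized reward intensity
    payoff = r**a / (r**a + (p / pe)**a)   # proportion of time worked
    return t_min + (t_max - t_min) * payoff

# Hypothetical baseline location parameters and test point.
FHM, PE = 50.0, 10.0
k = 10 ** 0.125  # a shift of 0.125 log10 units

baseline = time_allocation(40.0, 8.0, FHM, PE)

# Rescaling Pe displaces the surface along the price axis only:
# the same time allocation is found at a price scaled by k.
price_shifted = time_allocation(40.0, 8.0 * k, FHM, PE * k)

# Rescaling Fhm displaces the surface along the pulse-frequency axis only:
# the same time allocation is found at a frequency scaled by k.
freq_shifted = time_allocation(40.0 * k, 8.0, FHM * k, PE)

print(math.isclose(baseline, price_shifted))
print(math.isclose(baseline, freq_shifted))
```

In this sketch, a manipulation that acts before the output of the reward-growth function corresponds to a change in fhm, and a manipulation that acts downstream corresponds to a change in pe, so the direction of the displacement identifies the stage of processing affected.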
In contrast with the explanation advanced in early studies of the effects of cocaine on ICSS (Crow, 1970; Esposito et al., 1978), Hernandez et al. (2010) showed that the principal effect of this drug is to shift the mountain along the price axis. This finding narrows down the stages of processing at which cocaine could be producing its performance-enhancing action but leaves open multiple explanations at the pharmacological level because cocaine blocks the norepinephrine transporter (NET) and serotonin transporter (SERT), as well as the DAT (Iversen, 2000).
In contrast to cocaine, GBR produces a highly specific blockade of the DAT (Andersen, 1989). We show here that, like cocaine, GBR shifts the reward mountain rightward along the price axis. Thus, the present findings implicate dopamine in the potentiation of performance for BSR by means of one or more actions at or beyond the output of the reward-growth function. These actions could include boosting reward-system gain, reducing subjective effort costs, and reducing the value of alternate activities.
Figure 9 shows that, although 10 mg/kg GBR (intraperitoneally) displaced the reward mountain in the same direction as in the study by Hernandez et al. (2010), the magnitude of the shifts (mean = 0.14 log10 units) was substantially smaller than in response to continuous subcutaneous infusion of cocaine (mean = 0.38 log10 units). Whereas the shifts produced by cocaine were statistically reliable in all seven rats tested in the study by Hernandez et al., the shifts produced by GBR met the statistical criterion in only 7 of 10 rats. The microdialysis data suggest that the similarities and differences in the behavioral effects of the two drugs reflect similarities and differences between their effects on monoaminergic neurons.
As shown in Figure 4a, 10 mg/kg GBR (intraperitoneally) produced an increase in dopamine levels in the NAc shell similar to that observed in response to continuous subcutaneous infusion of cocaine (1.75 mg · kg−1 · h−1). In contrast, delivery of rewarding MFB stimulation interacted differently with the two drugs. In the cocaine-treated rats, MFB stimulation produced an additional boost in dopamine concentration; in the GBR-treated rats, dopamine concentration failed to rise further during delivery of the stimulation (Fig. 4b). This difference could well account for the larger displacements of the mountain along the price axis produced by cocaine.
Cocaine-induced blockade of the NET could have contributed to the higher dopamine levels and larger price shifts observed in response to cocaine than to GBR by potentiating the excitation of the ventral tegmental area (VTA) dopamine neurons by NE neurons in the locus ceruleus (Grenhoff et al., 1993). The NE neurons, in turn, could have been excited by input from hypothalamic orexin neurons (Sutcliffe and de Lecea, 2002; Bonnavion and de Lecea, 2010; Henny et al., 2010) activated by the MFB stimulation.
Compounds acting at 5-HT1A receptors have been shown to yield effects on ICSS opposite to those produced by compounds acting at 5-HT1B and 5-HT2C receptors (Hayes and Greenshaw, 2011). Such opposing effects are consistent with a report that the effectiveness of rewarding MFB stimulation was not changed by moderate doses of the specific SERT blocker fluoxetine (Harrison and Markou, 2001). Thus, it is not clear what net effect, if any, would be contributed by blockade of the SERT during the regimen of cocaine administration used here.
The effects of manipulating dopamine neurotransmission on operant performance for reward have been attributed to changes in reward intensity (Crow, 1970; Esposito et al., 1978; Wise, 1980). Rival accounts are couched in terms of changes in the proclivity to invest effort in pursuit of reward (Salamone, 2002; Salamone et al., 2005) or in subjective vigor costs (Niv et al., 2007). Hernandez et al. (2010) have argued that, although the form in which Wise originally phrased the reward-intensity argument cannot explain the observed shifts along the price axis, his argument can be adapted to account for these effects by substituting changes in gain for changes in sensitivity. To distinguish this reformulation of Wise's argument from the effort-based accounts, it will be necessary to adapt the mountain method so as to estimate the function mapping the work required to obtain a reward into its subjective effort costs (Hernandez et al., 2010).
Rats (Witten et al., 2011) and mice (Adamantidis et al., 2011; Kim et al., 2012) will perform operant responses for direct optical activation of VTA dopamine neurons. How can these results be squared with the demonstrations that both cocaine and GBR shift the reward mountain along the price axis, an effect attributed to action beyond the output of the directly stimulated neurons?
The fine, unmyelinated axons of dopamine neurons have very high thresholds for activation by short-duration pulses of extracellular current (Yeomans et al., 1988) and are unlikely to constitute a major part of the directly activated substrate for MFB self-stimulation (Bielajew and Shizgal, 1986; Shizgal, 1997). Multiple glutamatergic (Geisler et al., 2007) and cholinergic (Oakman et al., 1995) pathways are positioned to relay activation of non-dopaminergic MFB fibers to dopamine somata in the VTA, and there is considerable empirical evidence that these inputs are driven by rewarding MFB stimulation (Rada et al., 2000; You et al., 2001; Sombers et al., 2009). By analogy to a hypothesis advanced by Moisan and Rompré (1998), Hernandez et al. (2010) proposed that dopaminergic somata and/or their afferents may integrate input from directly activated, non-dopaminergic MFB fibers. On this view, reward intensity is represented by the firing of the dopamine neurons, whether induced directly by optogenetic means or indirectly by electrical activation of afferent pathways. DAT blockade would rescale upward the synaptic output of the dopamine neurons, increasing reward-system gain and shifting the mountain along the price axis. This hypothesis could be tested by specific optogenetic activation or silencing of dopaminergic neurons or their afferents. Optogenetic activation and silencing could also test the hypothesis (Lin et al., 2007) that excitatory input to VTA dopamine neurons from locus ceruleus NE neurons makes a synergistic contribution to the rewarding effect of MFB stimulation. Thus, the combination of powerful new methods for altering signal flow in specific neural populations with the reward-mountain method should provide new insights about the neural circuitry underlying reward seeking.
Notes
Supplemental material for this article is available at http://spectrum.library.concordia.ca/974074/. This video reveals a fundamental source of ambiguity in 2D measurements of operant performance for reward, such as those obtained in the curve-shift and progressive-ratio paradigms. We show how the 3D portrayal provided by the reward-mountain model resolves this ambiguity. This material has not been peer reviewed.
Footnotes
This research was supported by Canadian Institutes of Health Research Grant MOP-74577 and the Concordia University Research Chairs Program (P.S.), a grant from the Research Fund of Québec–Health (Fonds de recherche du Québec—Santé) (to the Centre for Studies in Behavioural Neurobiology), and Mexican National Council of Science and Technology (CONACYT) Grant 209314 and Ministry of Education, Leisure and Sports of Québec Grant 140498 (under the Program of Grants of Excellence for Foreign Students) (I.T.-P.). David Munro built and maintained the computer-controlled equipment for experimental control and data acquisition. Software for experimental control and data acquisition was written and maintained by Steve Cabilio.
The authors declare no competing financial interests.
Correspondence should be addressed to Peter Shizgal, Centre for Research in Behavioural Neurobiology, Concordia University, Room SP-244, 7141 Sherbrooke Street West, Montréal, Québec, Canada, H4B 1R6. peter.shizgal@concordia.ca