Abstract
Activity changes in a large subset of midbrain dopamine neurons fulfill numerous assumptions of learning theory by encoding a prediction error between actual and predicted reward. This computational interpretation of dopaminergic spike activity invites the important question of how changes in spike rate are translated into changes in dopamine delivery at target neural structures. Using electrochemical detection of rapid dopamine release in the striatum of freely moving rats, we established that a single dynamic model can capture all the measured fluctuations in dopamine delivery. This model revealed three independent short-term adaptive processes acting to control dopamine release. These short-term components generalized well across animals and stimulation patterns and were preserved under anesthesia. The model has implications for the dynamic filtering interposed between changes in spike production and forebrain dopamine release.
Introduction
During reward-based learning tasks in alert primates, single-unit recordings in midbrain dopamine neurons demonstrate that phasic activity encodes the ongoing difference between experienced reward and expected reward (prediction error) that complies with the basic assumptions of learning theory (Montague and Sejnowski, 1994; Montague et al., 1996; Schultz et al., 1997; Hollerman and Schultz, 1998; Schultz and Dickinson, 2000; Waelti et al., 2001). However, these electrophysiological data provide no information on the transformation from spike-encoded reward-prediction errors to dopamine delivery that modulates reward-seeking behavior (Phillips et al., 2003b). To bridge this gap, we made rapid electrochemical dopamine measurements in the striatum of freely moving rats while delivering multiple patterns of electrical stimulation to dopamine axons. This revealed rich dynamic adaptation of dopamine release that attends changes in spike production.
Previous work measuring dopamine release in anesthetized rats has demonstrated that after brief, intense stimulation of dopamine neurons, ∼20 min is required for complete recovery of dopamine release (Michael et al., 1987). These findings inspired a fixed amplitude model of dopamine release (Wightman et al., 1988) that was used to fit dopamine release data provided that stimulation frequencies remained extremely low. However, more recent work using less intense, but more rapidly repeated electrical stimuli, shows that rich, dynamic fluctuations occur in response to intracranial self-stimulation (Garris et al., 1999; Yavich and Tiihonen, 2000) or experimenter-delivered stimuli in vivo (Kilpatrick et al., 2000; Yavich and MacDonald, 2000) and in vitro (Cragg, 2003). Collectively, these observations suggested the hypothesis that multiple dynamic components modulate dopamine delivery, thus contradicting fixed-amplitude models of release (Wightman et al., 1988). We show here that a single dynamic model captures all the measured changes in ongoing dopamine release, provides evidence of a family of short-term adaptation mechanisms in dopamine release, and provides further insight into the role of this important neuromodulatory system.
Materials and Methods
In vivo measurements. Striatal dopamine was measured in rats using methods described previously (Garris et al., 1997; Phillips et al., 2003a). Animal care was approved by the Institutional Animal Care and Use Committee of the University of North Carolina. Stereotaxic surgery was performed to chronically implant (1) a bipolar stimulating electrode in the substantia nigra/ventral tegmental area, (2) a guide cannula above the ipsilateral caudate-putamen, and (3) an Ag-AgCl reference electrode. After full recovery, experiments were performed by lowering a carbon-fiber microelectrode through the guide cannula into a position in the caudate-putamen where dopamine release was optimized. Extracellular dopamine was measured by fast-scan cyclic voltammetry (-0.4 to +1.0 V to -0.4 vs Ag/AgCl, 300 V/sec) every 100 msec during patterns of electrical stimulation. Carbon-fiber microelectrodes were calibrated in vitro after use with a dopamine stock solution. In some experiments, measurements were made under urethane anesthesia (1.5 gm/kg; Wightman et al., 1988).
Formulation of the model. A previous model of dopamine delivery (Wightman et al., 1988) sets the rate of change of dopamine concentration (C) equal to: (dC/dt) = rate added through release - rate removed by uptake.
In its simplest form, a fixed concentration of dopamine Cp is released with each stimulus pulse, and uptake follows Michaelis-Menten kinetics characterized by a maximal velocity Vm and affinity constant, Km. This can be formulated as:
$$\frac{dC}{dt} = r C_p - \frac{V_m C}{K_m + C} \qquad (1)$$
where r represents the impulse rate. However, at normal physiological impulse rates, the complex dynamics apparent in the experimental data cannot be captured with this simple approach (see variability of responses in Fig. 1B), and a more sophisticated approach is required.
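The behavior of the fixed-amplitude model can be illustrated with a minimal sketch, assuming the release-minus-uptake form described above, dC/dt = rCp - VmC/(Km + C). The uptake constants are the values quoted in the text; the impulse rate (20 Hz), the release per pulse (0.05 μM), the forward-Euler scheme, and the function names are assumptions chosen only for illustration.

```python
# Fixed-amplitude model: dC/dt = r*Cp - Vm*C/(Km + C)
VM = 4.0  # uM/sec, maximal uptake velocity (value quoted in the text)
KM = 0.2  # uM, uptake affinity constant (value quoted in the text)


def simulate_fixed_amplitude(rate_hz, cp_um, duration_s, dt=1e-3, c0=0.0):
    """Forward-Euler integration; returns the concentration trace in uM."""
    c = c0
    trace = [c]
    for _ in range(int(round(duration_s / dt))):
        release = rate_hz * cp_um      # constant release term r*Cp
        uptake = VM * c / (KM + c)     # Michaelis-Menten uptake
        c += dt * (release - uptake)
        trace.append(c)
    return trace


def steady_state(rate_hz, cp_um):
    """Analytic plateau where release balances uptake (needs r*Cp < Vm)."""
    release = rate_hz * cp_um
    if release >= VM:
        raise ValueError("release saturates uptake; no finite plateau")
    return KM * release / (VM - release)


# Hypothetical inputs: 20 Hz stimulation, 0.05 uM released per pulse.
trace = simulate_fixed_amplitude(rate_hz=20.0, cp_um=0.05, duration_s=5.0)
plateau = steady_state(rate_hz=20.0, cp_um=0.05)
```

Under these assumptions the simulated concentration rises monotonically to a single plateau set by r and Cp. A model of this form therefore cannot produce different response amplitudes to identical stimuli, which is the limitation the text describes.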
Figure 1. Extracellular dopamine in the caudate-putamen of freely moving rats. A, A dopamine transient evoked by a repetitive electrical stimulation (24 pulses, 60 Hz, 120 μA; horizontal bar) delivered to dopamine neurons in the substantia nigra/ventral tegmental area with a bipolar electrode (left panel). The amplitude and duration of this concentration transient are quite similar to a subsecond dopamine transient elicited in another male rat after exposure to an estrous female (right panel). B, Dopamine concentration transients evoked by an irregular stimulation train where each stimulus (24 pulses; vertical bar) is a repetitive electrical stimulation identical to that delivered in A. The irregular stimulus pattern is a “playback” of an intracranial self-stimulation lever-press pattern of another rat. Both facilitation (compare amplitudes at 1 and 2) and depression (3) in the dynamics governing evoked release are apparent.
The parameters for dopamine reuptake (Vm and Km) have been shown to be fairly stable (Venton et al., 2003); consequently, we focused on the limitations implicit in the first term rCp, which characterizes dopamine release according to the average impulse rate r and the average release per impulse Cp. Deviating from this approach, we modeled the concentration of dopamine added by each impulse as a product of two time-dependent functions p(t)A(t). A(t) is a function that depends on independent facilitation and depression components Ij(t), and its initial value, a0. p(t) is a function that models the exact impulse times for a specific pattern of impulses. As detailed below, each Ij(t) possesses “kick and relax” dynamics.
In this new approach, as with the previous fixed amplitude model, the Michaelis-Menten variables for uptake (Vm and Km) were not used as free parameters in the model, but were instead maintained at empirically determined constant values of 4.0 μM/sec and 0.2 μM, respectively (Wightman and Zimmerman, 1990). The dynamic gain control of release is captured by the time-varying function A(t), itself composed of separate “hidden” dynamic processes Ij. This approach is analogous to that of Abbott et al. (1997) and Varela et al. (1997), addressing short-term changes in glutamatergic transmission, except that all factors are multiplicative:
$$A(t) = a_0 \prod_j I_j(t) \qquad (2)$$
The kick and relax dynamics are straightforward. At each spike, Ij is multiplied (kicked) by a factor kj, so that Ij is replaced by kj Ij (kj > 1 gives facilitation, whereas kj < 1 gives depression). During interstimulus periods, the independent dynamic factors decay to their equilibrium value of 1.0 with first order kinetics (time constant τj) so that their product A(t) decays back to its initial value a0:
$$\frac{dI_j}{dt} = \frac{1 - I_j}{\tau_j} \qquad (3)$$
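The kick and relax rule for a single component can be sketched as follows, assuming first-order relaxation toward 1.0, dIj/dt = (1 - Ij)/τj, which has the closed-form solution used in `relax`. The time constants (4.5 and 3.5 sec) are the approximate values reported in Results; the kick values (1.01 and 0.99) and all function names are hypothetical.

```python
import math


def relax(i_value, dt, tau):
    """Exact first-order relaxation toward 1.0 over an interval dt."""
    return 1.0 + (i_value - 1.0) * math.exp(-dt / tau)


def kick(i_value, k):
    """Multiplicative kick applied at each impulse (k > 1 facilitates)."""
    return k * i_value


def component_after_train(pulse_times, k, tau):
    """Value of one component I_j just after the last pulse of a train."""
    i_value, t_prev = 1.0, None
    for t in pulse_times:
        if t_prev is not None:
            i_value = relax(i_value, t - t_prev, tau)  # decay between pulses
        i_value = kick(i_value, k)                      # kick at the pulse
        t_prev = t
    return i_value


# One stimulus train as used in the experiments: 24 pulses at 60 Hz.
pulses = [n / 60.0 for n in range(24)]
facil = component_after_train(pulses, k=1.01, tau=4.5)  # hypothetical kick
depr = component_after_train(pulses, k=0.99, tau=3.5)   # hypothetical kick
```

Because the pulses within a train are separated by far less than either time constant, the kicks accumulate almost multiplicatively during the train and relax mainly in the interval between trains.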
As indicated above, the fixed rate r is replaced by a function p(t) that describes the entire pattern of impulses evoked in the dopamine neurons by the stimulating electrode. With these changes, Equation 1 becomes:
$$\frac{dC}{dt} = p(t)A(t) - \frac{V_m C}{K_m + C} \qquad (4)$$
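A minimal simulation of the dynamic model might look like the sketch below. It assumes the form dC/dt = p(t)A(t) - VmC/(Km + C), with A(t) equal to a0 times the product of the components Ij, and treats each impulse as an instantaneous addition of A(t) to C. Following the order stated under Data fitting, C is updated first and the components are kicked afterward. The values of a0 and kj, the Euler step for uptake, and the function name are assumptions for illustration and are not the fitted values.

```python
import math

VM, KM = 4.0, 0.2  # uptake constants quoted in the text (uM/sec, uM)


def simulate_release(pulse_times, a0, components, duration_s, dt=1e-3):
    """Integrate dC/dt = p(t)A(t) - Vm*C/(Km + C) with kick-and-relax gain.

    components is a list of (k_j, tau_j) pairs.
    Returns (times, concentrations) sampled every dt seconds.
    """
    gains = [1.0] * len(components)
    pulses_at_step = {}
    for t in pulse_times:
        step = int(round(t / dt))
        pulses_at_step[step] = pulses_at_step.get(step, 0) + 1

    c, times, conc = 0.0, [], []
    for step in range(int(round(duration_s / dt)) + 1):
        for _ in range(pulses_at_step.get(step, 0)):
            amplitude = a0
            for g in gains:
                amplitude *= g              # A(t) = a0 * product of I_j
            c += amplitude                  # release is computed first
            gains = [g * k for g, (k, _) in zip(gains, components)]  # kick
        times.append(step * dt)
        conc.append(c)
        c -= dt * VM * c / (KM + c)         # Euler step for uptake
        gains = [1.0 + (g - 1.0) * math.exp(-dt / tau)   # relax toward 1.0
                 for g, (_, tau) in zip(gains, components)]
    return times, conc


# Hypothetical parameters: one facilitating and two depressing components.
components = [(1.01, 4.5), (0.99, 3.5), (0.999, 840.0)]
pulses = [n / 60.0 for n in range(24)]  # 24 pulses at 60 Hz
times, conc = simulate_release(pulses, a0=0.05, components=components,
                               duration_s=2.0)
```

Setting `components` to an empty list recovers fixed-amplitude release for an explicit pulse pattern, which gives a convenient baseline for comparison.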
Data fitting. A direct integration approach was used to estimate the model (C) based on the measured dopamine fluctuations. The stimulation pattern p(t) was modeled explicitly as
$$p(t) = \sum_i \delta(t - t_i)$$
where t_i are the times of the individual stimulus pulses. The free parameters in this approach were the kick values kj, the associated time constants τj, the initial value a0, and the time constant of the electrochemical probe τR. The time constant τR accounts for the delays due to diffusion in the extracellular space and the response time of the electrode. That is, τR characterizes the low pass filtering that occurs with this type of electrochemical measurement (Bath et al., 2000). For each fit of the model to the measured dopamine levels, the number of dynamic components Ij and their polarity (facilitating or depressing) were preselected. The optimization of the fits used Powell's method with a fourth order integration scheme (Powell, 1964; Press et al., 1990). The fits produced the greatest error reduction for one facilitating factor and two depression components, and no substantial improvement for four or more components. In all cases, a large range of parameter values was explored by the fitting procedure. Furthermore, this direct integration approach yielded fitted values for the electrode time constant τR in the range of previous empirical measurements of its value (∼200 msec; Bath et al., 2000). In this direct integration approach, C was computed first followed by updating of the dynamic components Ij.
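The role of τR can be illustrated with a short sketch. The text states only that τR characterizes low pass filtering; the first-order filter below, dM/dt = (C - M)/τR, is an assumed form, as are the function names and the sum-of-squares error measure. The value of 200 msec is the approximate empirical value cited above.

```python
import math


def lowpass(signal, dt, tau_r, m0=0.0):
    """First-order low-pass filter, dM/dt = (C - M)/tau_r (assumed form).

    The input is treated as constant within each step of length dt.
    """
    alpha = 1.0 - math.exp(-dt / tau_r)
    m, out = m0, []
    for c in signal:
        m += alpha * (c - m)
        out.append(m)
    return out


def sum_squared_error(model, data):
    """Residual error between a filtered model trace and measured data."""
    return sum((m - d) ** 2 for m, d in zip(model, data))


# Step response with tau_R = 0.2 sec, sampled at 1 msec for 0.2 sec.
step_response = lowpass([1.0] * 200, dt=1e-3, tau_r=0.2)
```

An error function of this kind, evaluated on the filtered model output, is the sort of objective that a derivative-free routine such as Powell's method would minimize over kj, τj, a0, and τR.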
Results
Striatal dopamine release was evoked by electrical stimulation of dopaminergic cell bodies and monitored with fast-scan cyclic voltammetry at carbon-fiber microelectrodes. In freely moving rats, both regular and irregular temporal patterns of repetitive stimulation were used to elicit dopamine release. The neurochemical response to a single stimulus train was similar to dopamine release evoked by salient behavioral cues (Fig. 1A) (see also Robinson et al., 2002). The irregular patterns of stimulation were the recorded lever-press records of other animals during self-stimulation experiments (patterns taken from self-stimulation experiments described in Kilpatrick et al., 2000). As shown in Figure 1B, the richness of the dynamic influences on dopamine release is apparent for an irregular stimulation pattern in which both facilitation (arrow 2) and depression (arrow 3) of release, compared with the initial event (arrow 1), are evident. This depression is reversible, demonstrated by a significant recovery of release after 15 min (data not shown). Dynamic changes also emerge using regular interval patterns of stimulation (Fig. 2D-F).
Figure 2. Dynamic model reveals multiple adaptive mechanisms for ongoing dopamine release. A, Semi-log plot of the normalized error for fitting the model to experimental data versus the number of independent dynamic components Ij. Representative fits are shown from four animals. The error decreases up to three components and does not improve appreciably beyond that. B, The three dynamic components consistently captured by the model: short-lasting facilitation (top), short-lasting depression (middle), and longer-lasting depression (bottom). These time-dependent mathematical components are induced by each action potential (arrow) and multiplicatively modify the amplitude of dopamine release for future action potentials. C, Fit of dynamic model to dopamine released during a single repetitive electrical stimulation (24 pulses, 60 Hz, 120 μA) applied to dopamine neurons of an ambulant rat. In all panels, the model is magenta, data are blue, and r² is the correlation coefficient of the goodness of fit. D, Fit of model to dopamine fluctuations evoked in an anesthetized animal by more intense stimulus trains. Each repetitive stimulus (vertical bar) consisted of 600 pulses (60 Hz, 120 μA). The repetitive stimulus trains were delivered at 2, 5, 10, or 20 min intervals. With 2 min interstimulus intervals, depression is still evident on the next stimulus. At longer intervals there is greater recovery of release. Inset, Response for a single 600 pulse train (60 Hz, 120 μA). E, Fit of model to dopamine fluctuations evoked by stimuli as in C repeated at 2 sec intervals. Facilitation is apparent. F, Stimuli as in C repeated at 5 sec intervals. A gradual depression is apparent.
The model described in Materials and Methods was fit to these experimental data to determine values for the kick, kj, and the time constant, τj, for each independent dynamic component Ij, and a0, the initial estimate of dopamine released per stimulus pulse (Table 1). Error analysis was used to select the number of unique dynamic factors in any particular fit. Three key features emerged from this error analysis. (1) There is a dramatic reduction in residual error as the number of independent components (I values) is increased from 1 to 3, and very little reduction thereafter (Fig. 2A). (2) The observed dopamine dynamics are consistently captured by one facilitating (kj > 1) and two depressing (kj < 1) components (Fig. 2B). (3) The number and type of the dynamic components do not change with anesthesia.
Table 1. Best-fit parameter values
The model captures the extracellular dopamine dynamics for an individual repetitive stimulus (Fig. 2C) as well as interstimulus dynamics during regular patterned stimulation in both anesthetized (Fig. 2D) and freely moving animals (Fig. 2E,F). In alert animals, regular patterns of repetitive stimuli generated dopamine release profiles that revealed the individual facilitation and depression suggested by the rich dopamine dynamics observed in Figure 1B. When stimulation trains are repeated every 2 sec, the first ∼10 trains elicit enhanced dopamine release (Fig. 2E), whereas with 5 sec intervals there is decremented release (Fig. 2F). Good fits (r² = 0.93 ± 0.06) of the three-component model were also obtained for irregular stimulus patterns in awake animals (Table 1). To test the limits of the system, experiments were done in anesthetized rats that permitted the use of much longer stimulations (600 pulses, 60 Hz) (Fig. 2D). Even under these circumstances, the model fits the changing amplitude of dopamine transients across the entire period of the experiment (>3 hr), as well as the shape of the individual responses to stimulation trains (Fig. 2D, inset). Furthermore, when the stimulation amplitude was increased from 120 to 300 μA, a0 increased proportionately, but all of the other parameters were preserved (data not shown).
The robustness of the model, as suggested by the error analysis, is also supported by the way in which the fitted parameter values generalize across animals, patterns of stimulation, and with anesthesia. The values for the facilitation and the shorter-term depression are summarized in Table 1. The longer-term depression time constant, τ3, was found to have a lower limit of 10 min from the short experiments. Longer experiments in anesthetized animals further constrained this value to ∼14 min, a result consistent with previous findings (Michael et al., 1987). The time constants extracted by the model were further verified by an alternative approximation (Appendix, see supplementary material, available at www.jneurosci.org) that also consistently found two depression terms (one short ∼3.5 sec; one long ∼12-15 min) and one facilitation term (short ∼4.5 sec).
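To give a feel for what a ∼14 min time constant implies, the sketch below computes how much of a depressed component's deficit (1 - Ij) has relaxed away after each of the interstimulus intervals used in Figure 2D. It assumes pure first-order recovery with no further stimulation during the interval; the function name is hypothetical.

```python
import math

TAU3_MIN = 14.0  # long-term depression time constant from the text (~14 min)


def fraction_recovered(interval_min, tau_min=TAU3_MIN):
    """Fraction of the deficit (1 - I_j) that has decayed after the interval."""
    return 1.0 - math.exp(-interval_min / tau_min)


# Interstimulus intervals used for the long trains in Figure 2D (minutes).
recovery = {t: round(fraction_recovered(t), 2) for t in (2, 5, 10, 20)}
# recovery == {2: 0.13, 5: 0.3, 10: 0.51, 20: 0.76}
```

Under this assumption only about 13% of the deficit is recovered after 2 min, whereas about three-quarters is recovered after 20 min, in line with the greater recovery of release at longer intervals.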
In contrast to the consistency of the time constants, the kj (kick) parameters were more variable between recording locations and animals (Table 1), as was a0. Indeed, a0 was used as an adjustable parameter because it varies with the density of release sites adjacent to the specific recording site (Wightman and Zimmerman, 1990). Despite this variability, the model may be used to predict dopamine fluctuations in response to novel stimulus patterns (Fig. 3). The τj and kj parameters extracted from regularly repeated (every 1 sec) stimulation trains (Fig. 3A) were used to predict dopamine fluctuations for a stimulation pattern repeated at irregular time intervals in a different rat (Fig. 3B). Remarkably, the model accurately predicts the changes in extracellular dopamine concentration (r² = 0.96), including the facilitation seen initially and the almost complete attenuation of release at the end of the train. The generalization across animals and stimulation patterns is evident in Figure 3D, in which the short-term facilitation and depression time constants are plotted against one another. There is a clear clustering in this plot. Here, there are six points that represent fits to irregular patterns of stimulation along with seven other points that represent regular patterns in awake and anesthetized animals. The points for the fits in Figure 3A-C are shown as color-coded circles. The average for the entire ensemble of fits is shown with error bars extending along both axes.
Figure 3. Dopamine release for complex stimulus patterns. A, Best fit of model to dopamine fluctuations evoked by repetitive stimulus trains (24 pulses, 60 Hz) spaced regularly at 1 sec intervals. Facilitation is apparent. The model is magenta, data are blue, and r² is the correlation coefficient of the goodness of fit. B, The parameters extracted from the fit to the data in A (in one animal) were used to predict dopamine fluctuations evoked by the repetitive stimulus delivered (in another animal) at irregular intervals at approximately the same rate (∼1 Hz). The prediction is magenta, data are blue, and r² is the correlation coefficient of the goodness of fit. A more complete exploration of the parameter space for the data shown in B will improve the fit, but not dramatically, indicating good generalization across animals. C, Another irregular pattern was fit by the model (model = magenta; measured dopamine = black). D, Plot of short-term depression versus facilitation time constants shows a consistent clustering. Each dot represents a fit to the measured dopamine fluctuations in a single animal. The red and green dots indicate the time constants used for the fits from A-C.
Discussion
These results establish that ongoing dopamine release is controlled by adaptive mechanisms rather than having a fixed amplitude. These are analogous to the numerous mechanisms of gain control observed at glutamatergic synapses with postsynaptic recordings (Markram et al., 1997; Bear, 1999; Malenka and Nicoll, 1999; Abbott and Nelson, 2000). Here we were able to avoid the confound of postsynaptic receptor changes while studying gain control of dopamine release (a presynaptic phenomenon) by using direct chemical measurements. The fluctuations of dopamine release could be described with a three-component dynamic model that robustly captured the short-term plasticity. The modeled facilitation and depression time scales were consistent across animals and stimulation patterns, and were not changed by anesthesia (Table 1, Figs. 2, 3).
Physiological relevance of electrical stimulation
In this study, dopamine release was evoked using electrical stimulation of dopamine cell bodies. The current used for most experiments (120 μA) produces ∼30-50% of maximal striatal dopamine release (Wiedemann et al., 1992), suggestive of synchronous activation of ∼50% of dopamine neurons in the ipsilateral substantia nigra and ventral tegmental area. The level of synchronicity observed for behaviorally salient stimuli is at least this high (Hollerman and Schultz, 1998; Hyland et al., 2002). The number of spikes in a burst can be numerous in alert rats (up to 20; Freeman et al., 1985), and the instantaneous frequency can be high, especially in response to a rewarding stimulus (27% of intraburst intervals have an instantaneous frequency of >50 Hz; Hyland et al., 2002). Although the dopamine increases were similar to those evoked by a salient behavioral stimulus (Fig. 1A) (Robinson et al., 2002), the repetitive stimulation we used for most experiments (24 pulses, 60 Hz) represents an extreme and almost certainly supraphysiological activation. Nonetheless, this type of stimulation is useful to expose the underlying physiological dynamics. Because the model, which describes events on a single-impulse basis, generalizes well across different stimulation patterns and across experiments where different repetitive stimuli were used, it succeeds in capturing the physiology at its fundamental unit (a single action potential).
Possible physiological mechanisms underlying the observed dynamics
In the simplest scenario, each dynamic factor could be associated with an identifiable physiological process. Most experiments used submaximal stimulation current to avoid masking physiological effects at the cell body (e.g., hyperpolarization) that may affect action potential propagation. However, even with a supramaximal stimulation current, the dynamics of dopamine release were preserved, suggesting that their control resides exclusively in the terminal. The long-lasting depression is on the same time scale as that limited by the rate of dopamine biosynthesis and vesicular packaging (Michael et al., 1987). Inhibition of release through terminal autoreceptors decays with a time constant exactly in the range of the shorter-term depression (Phillips et al., 2002), and so this could be a major contributor to that component. However, it should be noted that short-lasting autoreceptor-independent depression has also been observed with (more promiscuous) local stimulation (Phillips et al., 2002; Cragg, 2003). Facilitation of dopamine release may be a result of increased refilling of the readily releasable vesicular pool (Yavich and MacDonald, 2000), a function that is regulated by intracellular calcium (Wang and Kaczmarek, 1998). Indeed, calcium dynamics have been demonstrated to be an important factor that influences pulse-to-pulse dopamine release (Phillips and Stamford, 2000; Cragg, 2003). Changes in the rate of uptake could also modulate extracellular dopamine. However, the three-component model based on plasticity of release captured the data remarkably well without alteration of uptake parameters. Future experiments are required to test these hypotheses and establish the precise control points of dopamine dynamics.
Possible computational interpretation of dynamic filtering of dopaminergic spikes
Extensive electrophysiological work in primates (Hollerman and Schultz, 1998; Schultz and Dickinson, 2000) has led to the theory that phasic activity of midbrain dopamine neurons encodes a prediction error signal in the estimation of future reward (Montague and Sejnowski, 1994; Montague et al., 1996; Schultz et al., 1997). Our findings show that the transformation of such spike activity (and presumably the prediction errors that they embody) into dopamine release adapts according to spike history. Through circuit-level adaptation, the average firing that encodes the prediction error is driven to zero, optimizing the prediction of the time and magnitude of rewarding events in the near future (Schultz et al., 1997). However, all real-world scenarios are associated with variability that no amount of learning can eliminate, that is, irreducible uncertainty. The long-lasting depression component will eventually drive dopamine transmission to zero for sustained spiking (Fig. 1B, arrow 3), thus providing an additional level of filtering at the dopamine terminal. The functional relevance of the short-lasting facilitation and depression components is less clear. There is evidence that short-term dynamics are necessary to account for human performance on specific repeated-play decision-making tasks (such as in Montague and Berns, 2002). In particular, subjects demonstrate a short-term memory for previous actions (a form of eligibility trace) that follows similar dynamics to those characterized here for dopamine release (R. Bogacz, S. McClure, J. Cohen, and R. Montague, personal communication). Based on these findings, it is tempting to speculate that the dynamic components that we have identified may be the physical substrate for the eligibility traces in reward-dependent decision tasks given to humans (Bogacz, McClure, Cohen, and Montague, personal communication). This is particularly provocative because dopamine release is strongly implicated in biasing action selection (Phillips et al., 2003b). Thus, dynamic adaptation of dopamine release has broad implications for multiple aspects of behavior.
Footnotes
This work was supported by the National Institute on Drug Abuse, National Institute of Mental Health, and Kane Family Foundation. We thank Drs. Regina Carelli, Peter Dayan, Ron Fisher, Margaret Rice, Donita Robinson, and Michael Wiest for constructive comments and criticisms.
Correspondence should be addressed to either of the following: P. Read Montague (model), Human Neuroimaging Laboratory, Center for Theoretical Neuroscience, Division of Neuroscience, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, E-mail: read{at}bcm.tmc.edu; or Paul E. M. Phillips (experimental), Department of Psychology, Davie Hall CB 3270, University of North Carolina, Chapel Hill, NC 27599, E-mail: pemp{at}unc.edu.
S. M. McClure's present address: Department of Psychology, Princeton University, Princeton, NJ 08540.
Copyright © 2004 Society for Neuroscience 0270-6474/04/241754-06$15.00/0