Performance of instrumental actions in rats is initially sensitive to postconditioning changes in reward value, but after more extended training, behavior comes to be controlled by stimulus–response (S-R) habits that are no longer goal directed. To examine whether sensitization of dopaminergic systems leads to a more rapid transition from action–outcome processes to S-R habits, we examined performance of amphetamine-sensitized rats in an instrumental devaluation task. Animals were either sensitized (7 d, 2 mg/kg/d) before training (experiment 1) or sensitized between training and testing (experiment 2). Rats were trained to press a lever for a reward (three sessions) and were then given a test of goal sensitivity by devaluation of the instrumental outcome before testing in extinction. Control animals showed selective sensitivity to devaluation of the instrumental outcome. However, amphetamine sensitization administered before training caused the animals’ responding to persist despite the changed value of the reinforcer. This deficit resulted from an inability to use representations of the outcome to guide behavior, because a reacquisition test confirmed that all of the animals had acquired an aversion to the reinforcer. In experiment 2, post-training sensitization did not disrupt normal goal-directed behavior. These findings indicate that amphetamine sensitization leads to a rapid progression from goal-directed to habit-based responding but does not affect the performance of established goal-directed actions.
Studies of instrumental conditioning have established that, in rats, actions such as lever pressing are controlled by two dissociable associative processes. Early in acquisition, actions are mediated by goal-directed action–outcome (A-O) associations, requiring both the encoding of the instrumental contingency between the action and the specific outcome and a representation of the outcome as a goal. Thus, rats’ instrumental performance has been shown to be sensitive to postconditioning changes in reward value (Adams and Dickinson, 1981; Colwill and Rescorla, 1985; Dickinson and Balleine, 1994). As training proceeds, however, instrumental performance becomes habitual and, as a consequence, more independent of the current value of the goal (Adams, 1982; Dickinson et al., 1995). The development of habitual responding has been argued to reflect the increasing involvement of stimulus–response (S-R) associations and a decline in the contribution of A-O processes (Dickinson, 1985).
Recent research has provided evidence for a neural dissociation of these distinct learning processes involved in instrumental actions. Rats with lesions to the prelimbic medial prefrontal cortex display habit-based instrumental performance that is insensitive to outcome devaluation even with only limited amounts of training (Balleine and Dickinson, 1998; Killcross and Coutureau, 2003). Conversely, animals with lesions to the more ventral infralimbic cortex fail to develop habitual responding despite extended training (Coutureau and Killcross, 2003; Killcross and Coutureau, 2003). Similarly, lesions to the dorsolateral striatum preserve goal-directed behavior in overtrained rats, whereas lesions of the dorsomedial striatum disrupt the formation of A-O associations (Yin et al., 2004, 2005). These findings suggest a neural distinction between systems that control goal-directed actions and S-R habits. Overtraining of the instrumental response is one mechanism by which the balance between these two systems is altered (Adams, 1982).
There is good evidence that dopamine plays a role in the development of S-R habits. For example, in Parkinson’s disease, the degeneration of dopaminergic neurons in the substantia nigra is associated with impaired habit-learning in humans (Knowlton et al., 1996). Similarly, in rats, 6-OHDA lesions of the nigrostriatal dopamine system disrupt habit formation (Robbins et al., 1990; Faure et al., 2005), whereas posttraining intracaudate amphetamine injections accelerate S-R learning (Packard and White, 1991), and electrical stimulation of the substantia nigra is able to reinforce lever pressing (Reynolds et al., 2001). In the present study, we assessed the effect of sensitization of dopaminergic systems on the control of goal-directed behavior. Repeated exposure to psychostimulants has enduring behavioral consequences and induces long-term neural adaptations within brain areas that subserve learning and memory, including the mesostriatal dopamine system, the prefrontal cortex, and the amygdala (Vanderschuren and Kalivas, 2000; Everitt and Wolf, 2002; Robinson and Kolb, 2004).
The current experiments examined the effects of amphetamine pretreatment on animals’ ability to produce goal-directed actions, determining whether sensitization of dopaminergic systems accelerates the dominance of S-R habits. We compared the performance of amphetamine-exposed rats with vehicle control animals in a reinforcer devaluation task after limited training. To assess whether effects were mediated by learning or performance of the instrumental response, we measured sensitivity to outcome devaluation in animals exposed to amphetamine before training (experiment 1) or after training (experiment 2).
Materials and Methods
Experiments 1 and 2 consisted of pretraining and post-training amphetamine sensitization and instrumental devaluation by specific satiety and lithium chloride (LiCl)-induced nausea.
Thirty-two naive, male, hooded Lister rats (Harlan UK, Bicester, Oxon, UK) were used in experiment 1. At the beginning of the experiment, their mean ad libitum weight was 277 g (range, 255–323 g). The subjects in experiment 2 were 32 naive, male, hooded Lister rats with a mean ad libitum weight of 288 g (range, 275–318 g). Rats were housed in pairs in a climate-controlled vivarium (lights on 8:00 A.M. to 8:00 P.M.) and were tested during the light phase of the cycle. All experimental procedures involving animals and their care were performed in accordance with the United Kingdom Animals (Scientific Procedures) Act 1986 and were subject to Home Office approval (Project License PPL 30/2158).
d-Amphetamine sulfate (Sigma, Poole, UK) was dissolved in sterile PBS. PBS was also used for control vehicle injections. Doses of d-amphetamine sulfate, 2 mg/kg (sensitizing treatment) and 0.5 mg/kg (activity assay), were calculated as the salt.
The training apparatus comprised eight chambers (Paul Fray, Cambridge, UK) measuring 25 × 25 × 22 cm. The chambers were housed individually within sound-attenuating cabinets and ventilated by low-noise fans. Each chamber had three aluminum walls and a clear Perspex front wall. The roof was made of clear Perspex, and the floor consisted of 18 5-mm-diameter steel bars spaced 1.5 cm apart center to center, parallel to the back of the chamber. A recessed magazine that provided access to rewards via a hinged Plexiglas panel was located in the center of the left wall. The liquid rewards (0.1 ml) could be delivered into the magazine via a peristaltic pump. The reinforcers used were 20% w/v sucrose solution flavored with grape Kool-Aid (0.05% w/v) and 20% w/v maltodextrin solution flavored with cherry Kool-Aid (0.05% w/v) (Cybercandy, London, UK). Pilot studies indicated that, in normal rats, these reinforcers were well matched for motivational value but could be easily discriminated. Levers could be inserted to the left and right of the magazine. A house light (3 W) mounted in the roof provided general illumination. The apparatus and on-line data collection were controlled by means of an IBM-compatible microcomputer equipped with MED-PC software (Med Associates, St. Albans, VT).
Rats received intraperitoneal injections of 2 mg/kg d-amphetamine sulfate (amphetamine-sensitized group) or the equivalent volume of vehicle PBS (control group), once per day for 7 consecutive days. Rats were returned to their home cages immediately after each injection. Over a 7 d injection-free period, animals in experiment 1 were reduced to 80% of their ad libitum weight, before the start of behavioral training. One rat in experiment 1 died during sensitization treatment, so that 31 rats in total (15 vehicle controls and 16 amphetamine-exposed rats) proceeded to the training stage. Animals in experiment 2 were reduced to 80% of their ad libitum weight before undergoing behavioral training. On the completion of this training, the rats received the sensitization treatment, followed by a 7 d injection-free period before testing. Although there was a minor difference in the period of time between cessation of amphetamine injections and the start of devaluation testing between experiments 1 and 2 (14 and 11 d, respectively), this is unlikely to influence the assessment of sensitization, which has been shown to have profound behavioral effects across days, weeks, and even months (Vanderschuren and Kalivas, 2000).
In experiment 1, after the sensitization procedure, each animal was assigned to one of the eight conditioning chambers and thereafter was always trained in that chamber. At the start of each session, the house light came on and remained on throughout the session. The house light went out at the end of each session. Training consisted of two stages: magazine training and lever-press training. This was followed by extinction tests after devaluation by specific satiety and LiCl-induced nausea. In experiment 2, animals received training before the sensitization treatment and received extinction tests after devaluation by LiCl-induced nausea. The key stages of the experimental design for both experiments are summarized in Table 1.
All rats were trained to collect food rewards during two 30 min magazine training sessions. One-half of the animals were trained to collect the sucrose solution, and the other half were trained to collect the maltodextrin solution (counterbalanced across treatment and devaluation groups). The rewards were delivered on a random-time (RT) 60 s schedule by which rewards were delivered, on average, every 60 s.
The rats were trained initially to lever press during two sessions on a continuous schedule of reinforcement, with each press producing reward. One lever was inserted into the chamber at the beginning of the session and retracted at the end of the session. Each session continued until the rat had earned 25 reinforcers. In the next three sessions of training, rewards were delivered according to a random-interval (RI) 30 s schedule (reward available, on average, every 30 s and delivered after the next lever press). Because current evidence indicates that the critical determinant of sensitivity to outcome devaluation is the degree of exposure to the reinforcer rather than the number of responses made, the number of reinforcers earned in these sessions was strictly controlled (Adams, 1982). In each session, animals earned a total of 40 reinforcers. Thus, by the end of training, animals had earned a total of 120 rewards on this schedule. Although previous work has demonstrated that interval schedules are less sensitive to outcome devaluation than ratio schedules (Dickinson et al., 1983), previous research, as well as pilot studies, has shown that this low level of training is sufficient to produce stable rates of responding but maintains sensitivity to outcome devaluation in normal animals even when an RI schedule is used (Dickinson et al., 1995).
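As an illustration only, the RI 30 s contingency described above can be sketched as a simple discrete-time simulation. The sketch assumes that, in each 1 s step, an unarmed schedule arms with probability 1/30 and that the next lever press collects the reward; the actual MED-PC implementation is not described in the text. The function name `simulate_ri_session`, the constant press probability, and the seed are hypothetical and stand in for a real animal's behavior.

```python
import random


def simulate_ri_session(mean_interval_s=30.0, press_rate_per_min=10.0,
                        max_rewards=40, seed=0):
    """Simulate one random-interval (RI) session in 1 s time steps.

    Assumption: each second, an unarmed schedule arms with probability
    1 / mean_interval_s; the next lever press then collects the reward.
    The session ends once max_rewards reinforcers have been earned.
    """
    rng = random.Random(seed)
    p_arm = 1.0 / mean_interval_s
    # Hypothetical responder: constant probability of one press per second.
    p_press = press_rate_per_min / 60.0
    armed = False
    rewards = presses = seconds = 0
    while rewards < max_rewards:
        seconds += 1
        if not armed and rng.random() < p_arm:
            armed = True  # reward becomes available and waits for a press
        if rng.random() < p_press:
            presses += 1
            if armed:
                rewards += 1
                armed = False  # schedule restarts after delivery
    return {"seconds": seconds, "presses": presses, "rewards": rewards}


session = simulate_ri_session()
# Three RI sessions capped at 40 reinforcers each give the stated total.
total_rewards = 3 * session["rewards"]
```

Capping each session by reinforcers earned rather than by time or responses mirrors the rationale given above: exposure to the reinforcer is held constant across animals regardless of individual response rates.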
Because we planned to test the animals’ sensitivity to outcome devaluation and to ensure that the nondevalued group readily consumed an alternative reinforcer in prefeeding sessions, we equated the animals’ exposure to two reinforcers. In addition to the reinforcers earned in lever-press sessions, each rat received three sessions during which 40 presentations of the alternative reinforcer were made on an RT 30 s schedule. One-half of the animals were exposed to the alternative reinforcer in the afternoon after morning lever-press training, and the other half received alternative reinforcer sessions in the morning before lever-press training in the afternoon (counterbalanced across treatment and devaluation groups).
Devaluation by specific satiety (experiment 1 only)
All animals then received one session of devaluation by specific satiety, followed by an extinction test during which lever presses and magazine entry behavior were assessed. Animals were placed in feeding cages and given ad libitum access for 1 h to either the instrumental outcome (devalued group) or the alternative reinforcer (nondevalued group). Immediately after this prefeeding session, the animals were transferred to the conditioning chambers and received an 8 min extinction test in the absence of reward delivery. The lever was present during this session, but no reinforcers were delivered.
Devaluation by LiCl (experiments 1 and 2)
Because the animals had been through the extinction test above, rats in experiment 1 received a reminder session on the day after the test. Animals were given one session in which they lever pressed to earn a total of 40 rewards. The reminder session was identical to the initial sessions of instrumental training.
Thereafter, animals received 3 d of devaluation with LiCl. On each day, the rats were placed in the operant chambers and were given 40 free presentations of either the instrumental outcome (devalued group) or the alternative reinforcer (nondevalued group) on an RT 30 s schedule. Immediately after the cessation of each session, the devalued group received a 0.15 M, 10 ml/kg intraperitoneal injection of LiCl solution (Sigma), and the nondevalued group received an injection of the equivalent volume of saline. Twenty-four hours after the final session of taste aversion training, the animals’ sensitivity to outcome devaluation was assessed in an additional 8 min extinction test in the absence of reward delivery. This was conducted as described above.
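For readers who wish to convert the LiCl injection parameters into a dose per unit body weight, the arithmetic can be sketched as follows. The sketch reads the stated concentration as 0.15 mol/l and uses approximate standard atomic weights for lithium and chlorine; the function name `licl_dose` and the 230 g example body weight are hypothetical and are not values reported in the study.

```python
# Approximate molar mass of LiCl (Li 6.94 + Cl 35.45), in g/mol.
LICL_MOLAR_MASS_G_PER_MOL = 6.94 + 35.45


def licl_dose(body_weight_g, molarity_mol_per_l=0.15, volume_ml_per_kg=10.0):
    """Return (injection volume in ml, dose in mmol/kg, dose in mg/kg)."""
    weight_kg = body_weight_g / 1000.0
    injection_volume_ml = volume_ml_per_kg * weight_kg
    # mol/l multiplied by ml/kg gives mmol/kg.
    mmol_per_kg = molarity_mol_per_l * volume_ml_per_kg
    # mmol/kg multiplied by g/mol gives mg/kg.
    mg_per_kg = mmol_per_kg * LICL_MOLAR_MASS_G_PER_MOL
    return injection_volume_ml, mmol_per_kg, mg_per_kg


# Hypothetical 230 g rat, used only to show the call.
volume_ml, dose_mmol_per_kg, dose_mg_per_kg = licl_dose(230.0)
```

Under these assumptions, the dose per kilogram depends only on the concentration and the injection volume per kilogram, whereas the injected volume scales with the individual animal's weight.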
To demonstrate that the devalued group had acquired an aversion to the instrumental outcome, all rats underwent a 15 min reacquisition test. The animals were placed in the conditioning chambers, and lever pressing earned the instrumental outcome on an RI 30 s schedule.
To confirm sensitization, all animals were administered a 0.5 mg/kg intraperitoneal amphetamine challenge before assessment of levels of locomotor activity. These tests occurred immediately after the reacquisition tests. Activity was monitored using eight chambers (56 cm wide × 39 cm deep × 19 cm high). Activity within each chamber was recorded with pairs of photobeams situated 20 cm apart and 18 cm from the end of the cage connected to a control box (Paul Fray). Each beam break resulted in an incremental count for that chamber and was recorded by an Acorn computer programmed in BBC Basic. Locomotor activity was measured (total number of photobeam breaks) for 30 min.
Statistical analysis was performed using ANOVA with between-subject factors of devaluation (devalued vs nondevalued) and sensitization treatment (either sensitized or vehicle controls). Because the SD was proportional to the mean, the extinction data were subject to logarithmic transformations (Howell, 2002). All ANOVAs use an α level of p < 0.05 for the rejection of the null hypothesis.
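The analysis described above can be sketched, under stated assumptions, as a log transformation followed by a 2 × 2 between-subjects ANOVA. The sketch assumes base-10 logarithms of strictly positive scores and tests each 1 df effect as a contrast on unweighted cell means, which is one common way of handling unequal group sizes; the text does not specify the log base, the treatment of zero scores, or the type of sums of squares. The function names and all scores below are made-up illustrations, not data from these experiments.

```python
import math


def cell_stats(values):
    """Return (n, mean, within-cell sum of squares) for one cell."""
    n = len(values)
    mean = sum(values) / n
    ss = sum((v - mean) ** 2 for v in values)
    return n, mean, ss


def anova_2x2(cells):
    """cells maps (treatment, devaluation) to a list of scores.

    Returns F ratios, each on 1 and (N - 4) df, for both main effects and
    the interaction, computed as contrasts on unweighted cell means.
    """
    order = [("vehicle", "nondevalued"), ("vehicle", "devalued"),
             ("amphetamine", "nondevalued"), ("amphetamine", "devalued")]
    stats = [cell_stats(cells[key]) for key in order]
    df_error = sum(n for n, _, _ in stats) - 4
    ms_error = sum(ss for _, _, ss in stats) / df_error
    contrasts = {
        "treatment": (1, 1, -1, -1),
        "devaluation": (1, -1, 1, -1),
        "interaction": (1, -1, -1, 1),
    }
    f_ratios = {}
    for name, weights in contrasts.items():
        estimate = sum(w * m for w, (_, m, _) in zip(weights, stats))
        scale = sum(w * w / n for w, (n, _, _) in zip(weights, stats))
        f_ratios[name] = estimate ** 2 / (ms_error * scale)
    return f_ratios, df_error


# Made-up proportion-of-baseline scores, for illustration only.
raw = {
    ("vehicle", "nondevalued"): [0.8, 1.0, 1.2, 0.9],
    ("vehicle", "devalued"): [0.3, 0.4, 0.5, 0.35],
    ("amphetamine", "nondevalued"): [0.9, 1.1, 1.0, 0.8],
    ("amphetamine", "devalued"): [0.85, 1.0, 0.95, 1.1],
}
log_cells = {key: [math.log10(v) for v in values] for key, values in raw.items()}
f_ratios, df_error = anova_2x2(log_cells)
```

The log transformation is applied before the ANOVA because, as noted above, the SD was proportional to the mean; on the log scale the within-cell variances become more comparable.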
Experiment 1: pretraining amphetamine sensitization and instrumental devaluation by specific satiety
All of the rats acquired the initial instrumental response at the same rate (data not shown). Importantly, by the end of the 3 d of training, there were no differences in baseline responding as a result of pretreatment with amphetamine (mean responses per minute: vehicle group, 9.6; amphetamine group, 9.8). Similarly, there was no effect of devaluation group (mean responses per minute: to-be-devalued group, 10.3; to-be-valued group, 9.1). This was confirmed by ANOVA, which revealed no effect of amphetamine treatment (F < 1) or devaluation group (F(1,27) < 1.6; p = 0.221) and no interaction between these two factors (F < 1). Baseline differences are therefore unlikely to account for any effects of amphetamine in subsequent extinction tests.
Lever-press extinction test performance
The mean response rates per minute as a proportion of baseline (which did not differ; see above for details) for the 8 min of the extinction test are presented in the left panel of Figure 1. This suggests that the vehicle-pretreated animals’ lever-press performance was sensitive to the current value of the goal. Thus, the vehicle-injected control group performed fewer lever presses as a proportion of their baseline rates after prefeeding with the instrumental outcome (devalued, white bars) compared with those prefed the alternative reinforcer (nondevalued, gray bars). Conversely, the performance of the amphetamine-sensitized animals was not goal directed as demonstrated by their failure to show sensitivity to the change in reward value. The devalued group pressed the lever at an equivalent rate to the nondevalued group, suggesting that their responding was insensitive to goal value and habitual.
This description of the data was confirmed by statistical analysis. ANOVA yielded no effect of devaluation (F(1,27) = 2.99; p = 0.095) or treatment (F < 1) but critically a significant treatment-by-devaluation interaction (F(1,27) = 4.226; p < 0.05). Simple-effects analysis revealed that devalued and nondevalued performance differed in the vehicle-injected control group (F(1,13) = 6.917; p < 0.05) but not in the amphetamine-sensitized animals (F < 1).
Magazine entry extinction test performance
The right panel of Figure 1 shows magazine entry behavior during the extinction test. Preliminary analysis revealed no effect of treatment on baseline levels of magazine entry (F < 1; mean responses per minute: vehicle group, 5.8; amphetamine group, 6.3) and therefore data from the test were expressed as a proportion of baseline responding. This figure indicates that prefeeding produced a decrease in magazine entry behavior in both treatment groups but that this effect was more marked in the vehicle controls. Statistically, ANOVA revealed only a main effect of devaluation (F(1,27) = 4.959; p < 0.05) but no effect of treatment (F < 1) or an interaction (F(1,27) = 1.104; p = 0.303).
Experiment 1: pretraining amphetamine sensitization and instrumental devaluation by LiCl
In the test above, amphetamine-sensitized animals prefed the instrumental outcome showed no devaluation effect, indicating that responding was habitual rather than goal directed. In this respect, the amphetamine-sensitized rats’ lever-press performance mirrors that of overtrained rats (Adams, 1982) or the responding of animals with lesions to the prelimbic cortex (Balleine and Dickinson, 1998; Killcross and Coutureau, 2003). However, devaluation procedures do impact on magazine approach behavior in rats whose lever-press performance is habitual (Killcross and Coutureau, 2003). Although there was an effect of devaluation on magazine entry behavior in the amphetamine-sensitized animals reported here, it is evident from Figure 1 that this effect was small. It is known that sensitization with psychostimulants such as amphetamine leads to increased salience attribution to rewards and associated cues (Berridge and Robinson, 1998) and causes increased “wanting” rather than “liking” for associated rewards (Wyvell and Berridge, 2001). Thus, the lack of a devaluation effect in animals pretreated with amphetamine shown above may not have been the result of the accelerated learning of S-R associations, but rather attributable to the failure of the prefeeding procedure to devalue the outcome sufficiently. Conditioned taste aversion, induced by LiCl, produces far more robust devaluation effects in normal animals compared with prefeeding with the instrumental outcome. In this test, we reassessed the animals’ sensitivity to changes in goal value, but after pairing the reward with LiCl-induced illness. Furthermore, the level of aversion to the reinforcer could be assessed in a subsequent reacquisition test.
As with the initial instrumental training, amphetamine-sensitized rats pressed at comparable rates to the vehicle controls (F < 1; mean responses per minute: vehicle group, 9.3; amphetamine group, 9.7). Similarly, the to-be-devalued group did not differ from the to-be-valued group (F < 1; mean responses per minute: to-be-devalued group, 10.1; to-be-valued group, 9.0).
Lever-press extinction test performance
The left panel of Figure 2 displays the instrumental performance during the 8 min extinction test for the vehicle-injected control group and the amphetamine-sensitized rats as a proportion of their baseline responding. In the vehicle-injected control groups, test performance showed a marked reduction in responding after conditioned aversion training (devalued, white bars) relative to animals that had not been averted from that outcome (nondevalued, gray bars). In contrast, lever-press performance of the amphetamine-pretreated group seemed to be impervious to the change in the goal value. The amphetamine-sensitized rats averted from the instrumental reward showed comparable levels of responding to that of sensitized rats not averted from the reinforcer.
ANOVA with treatment and devaluation as factors supported this observation. There was a main effect of treatment (F(1,27) = 9.71; p < 0.01) and of devaluation (F(1,27) = 11.627; p < 0.01) but, crucially, also a highly significant interaction between these two factors (F(1,27) = 8.223; p < 0.01). Simple-effects analysis of this interaction confirmed that the devalued vehicle-injected group showed a marked suppression in lever-press responding compared with the nondevalued vehicle-injected animals (F(1,13) = 19.0486; p < 0.001) but that there was no effect of devaluation in amphetamine-sensitized rats (F < 1). Further simple-effects analysis revealed an effect of amphetamine treatment in the devalued groups (F(1,13) = 17.305; p < 0.01) but not in the nondevalued groups (F < 1).
Magazine entry extinction test performance
The mean magazine entries per minute, as a proportion of baseline, during the extinction test after taste aversion training are shown in the right panel of Figure 2. This figure shows that the animals with an aversion to the reinforcer performed considerably fewer magazine entries compared with the nondevalued controls. This was confirmed by ANOVA that yielded a highly significant main effect of devaluation (F(1,27) = 41.571; p < 0.001). It is also clear from Figure 2 that there were overall higher levels of magazine entry behavior in the amphetamine-sensitized rats compared with the vehicle controls: ANOVA revealed a main effect of treatment (F(1,27) = 11.533; p < 0.01). However, the extent of the devaluation effect in the sensitized animals was equivalent to that seen in the vehicle-injected control animals, as demonstrated by the lack of a treatment-by-devaluation interaction (F < 1). Hence, sensitization with amphetamine did not influence the ability of LiCl to produce a devaluation of magazine entry behavior. This contrasts with the effects of LiCl devaluation on instrumental responding (see above) and suggests that magazine entry behavior and lever-press performance may be subserved by dissociable neural and psychological processes.
Reacquisition test: lever-press performance
The results of the reacquisition test confirmed that the LiCl injections had successfully devalued the instrumental outcome in both drug treatment groups. The mean lever presses per minute for the rewarded reacquisition test are presented in the left panel of Figure 3. This indicates that, compared with the nondevalued control group, the devalued group performed considerably fewer lever presses in the 15 min rewarded test. Statistical analysis by ANOVA produced a highly significant main effect of devaluation (F(1,27) = 89.748; p < 0.001). The trend toward higher levels of responding in the amphetamine-sensitized group was maintained in the reacquisition test (F(1,27) = 14.703; p < 0.01), but again the level of devaluation in these animals was comparable to that seen in the drug-naive rats because there was no treatment-by-devaluation interaction (F < 1). These results stand in marked contrast to the results of the extinction test and, like the results for magazine approach above, indicate that the devaluation procedure was just as effective in sensitized animals as it was in the control group. This contrast between extinction and reacquisition tests also highlights an important feature of devaluation experiments: whereas extinction tests can provide evidence for the strength of reward expectation in goal-directed responding, lowered performance in reacquisition tests can reflect either the devaluation of goal-directed actions, or the direct punishment of S-R associations by the presentation of the nausea-inducing outcome.
Reacquisition test: magazine entry behavior
The effectiveness of the LiCl treatment in devaluing the instrumental reward is also supported by analysis of magazine entry behavior in the reacquisition test. The mean responses per minute were as follows: devalued vehicle group, 0.5; nondevalued vehicle group, 5.6; devalued amphetamine group, 2.8; nondevalued amphetamine group, 7.9. Both devalued groups showed a marked suppression in magazine entries compared with the nondevalued groups. ANOVA revealed a main effect of devaluation (F(1,27) = 35.117; p < 0.001). The amphetamine-sensitized animals again displayed higher levels of magazine entry behavior (F(1,27) = 6.877; p < 0.05), but this heightened activity did not influence the level of devaluation in these animals (no treatment-by-devaluation interaction; F < 1).
After the completion of behavioral training, rats in both the vehicle- and amphetamine-treated groups received a 0.5 mg/kg amphetamine challenge immediately before assessment of locomotor activity to confirm the presence of psychomotor sensitization. In the first 15 min of the activity assay, amphetamine-treated rats showed enhanced locomotor activity (mean total photobeam breaks, 314.5) relative to the vehicle-treated controls (mean total photobeam breaks, 230.1). ANOVA yielded a significant main effect of treatment (F(1,29) = 4.585; p < 0.05), confirming that amphetamine pretreatment had successfully sensitized these animals. We also examined correlations between the locomotor activity in response to the amphetamine challenge in the devalued sensitized animals and lever-press performance in the two extinction tests. Locomotor activity in the sensitization assay bore no relationship either to performance after devaluation by specific satiety (r = 0.1; p = 0.813) or after devaluation by LiCl (r = 0.034; p = 0.936). Hence, the failure to detect sensitivity to outcome devaluation in these animals is unlikely to be explained simply in terms of increased locomotor activity.
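The correlational check reported above can be sketched with a plain Pearson coefficient and its associated t statistic. This is a minimal illustration rather than the authors' analysis code; the function names are hypothetical, and the sample size of eight used in the example is an assumption (one-half of the 16 sensitized rats being in the devalued condition), since the text does not state it.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def t_from_r(r, n):
    """t statistic on (n - 2) df for testing r against zero (needs |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Reported r after devaluation by specific satiety; n = 8 is assumed.
t_satiety = t_from_r(0.1, 8)
```

A t value this close to zero is in keeping with the conclusion drawn above that locomotor activity at the challenge test does not predict lever-press performance after devaluation.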
Experiment 2: post-training amphetamine sensitization and instrumental devaluation
In experiment 1, we demonstrated that animals that had been sensitized with amphetamine failed to alter their lever-press performance in response to a change in the value of the reinforcer, brought about by prefeeding with the instrumental outcome and pairing the reward with illness. These results indicate that, in sensitized animals, instrumental responding was not goal directed but rather stimulus driven and habitual. However, whether the effect of sensitization was one on learning or performance of the response was confounded in this experiment. Sensitizing the animals after the initial training would allow these two possibilities to be dissociated. In experiment 2, therefore, the sensitization treatment was conducted after initial lever-press training. After 1 week of recovery, the sensitivity of rats to outcome devaluation was assessed by pairing the reward with LiCl before an extinction test. If the animals’ responding in this test is independent of the current value of the goal, it would suggest that the sensitization treatment had had an effect on the performance or expression of the instrumental action. If the effect of sensitization is restricted to the acquisition phase of instrumental learning, we would expect the animals to continue to show sensitivity to the changed value of the reinforcer.
Training proceeded smoothly with all animals acquiring the instrumental response at the same rate (data not shown). By the end of training, there were no differences in lever-press responding between animals allocated to the amphetamine sensitization group and vehicle-injected control group (F < 1; mean responses per minute: vehicle group, 11.9; amphetamine group, 11.9). Moreover, there was no effect of devaluation group (F(1,28) = 1.321; p = 0.26; mean responses per minute: to-be-devalued group, 11.5; to-be-valued group, 12.3) nor an interaction between treatment and devaluation allocation (F(1,28) < 2.193; p = 0.15).
Lever-press extinction test performance
The mean lever presses per minute as a proportion of baseline for the 8 min of the extinction test are shown in Figure 4. This suggests that, regardless of drug treatment, all animals with the devalued reinforcer (white bars) showed a marked reduction in lever pressing relative to the nondevalued group (gray bars). This description of the data was confirmed statistically by ANOVA with between-subjects factors of sensitization treatment and devaluation group. The post-training amphetamine treatment had no effect on the animals’ sensitivity to outcome devaluation, because there was a highly significant main effect of devaluation (F(1,28) = 43.101; p < 0.001) but no interaction (F < 1). There was a trend for overall higher responding in the amphetamine-sensitized group, but it failed to reach the level of rejection of the null hypothesis (F(1,28) = 2.89; p = 0.1).
Magazine entry extinction test performance
The success of the LiCl treatment in devaluing the outcome for both groups is also highlighted by analysis of magazine entry behavior during the extinction test. As is clear from Figure 4, both devalued groups (mean magazine entries per minute as a proportion of baseline: devalued vehicle group, 0.3; devalued amphetamine group, 0.5) showed a clear suppression in magazine activity compared with the nondevalued groups (nondevalued vehicle group, 0.7; nondevalued amphetamine group, 0.9). ANOVA yielded a highly significant main effect of devaluation (F(1,28) = 21.446; p < 0.001). Although there was a marginally significant trend toward higher magazine activity in the amphetamine-sensitized animals (F(1,28) = 4.172; p = 0.051), this failed to impact on the level of devaluation in these animals because there was no treatment-by-devaluation interaction (F < 1).
Reacquisition test: lever-press performance
The results of the reacquisition test, shown in Figure 3, confirmed that both treatment groups in the devalued condition had acquired a strong aversion to the reinforcer. Relative to the nondevalued controls, devalued rats pressed the lever at a lower rate. This observation was supported by statistical analysis. ANOVA revealed a main effect of devaluation (F(1,28) = 66.386; p < 0.001) and also a main effect of drug (F(1,28) = 17.121; p < 0.001), reflecting overall higher response rates in the amphetamine-sensitized animals. However, the higher level of responding in the amphetamine-sensitized rats did not influence the magnitude of the devaluation effect in these animals relative to vehicle controls, because there was no treatment-by-devaluation interaction (F < 1).
The results of the activity test confirmed that the pretreatment with amphetamine had successfully sensitized the rats. In response to the 0.5 mg/kg amphetamine challenge, the amphetamine-pretreated animals displayed heightened locomotor activity during the first 15 min of the 30 min assay (mean total photobeam breaks, 524.7) compared with the vehicle controls (mean photobeam breaks, 386.6). ANOVA revealed a main effect of treatment (F(1,28) = 21.428; p < 0.001). This enhanced activity, moreover, did not correlate with test lever-press performance (r = 0.016; p = 0.969).
These experiments investigated the effects of amphetamine pretreatment on the sensitivity of lever pressing to reward devaluation after limited training. Consistent with previous accounts, vehicle-injected control animals showed a selective suppression in lever-press performance after reinforcer devaluation by either specific satiety or LiCl-induced nausea. However, pretraining exposure to amphetamine disrupted acquisition of goal-directed behavior. Sensitized rats failed to modify their lever-press performance in response to the changed value of the outcome, responding at equivalent levels to those seen in nondevalued controls. This effect was observed after devaluation by both specific satiety and LiCl-induced nausea and suggests that the control of responding in the amphetamine-treated rats was not dependent on the expected outcome but instead was dominated by reflexive habits. The studies also revealed a dissociation between the effect of pretraining and post-training sensitization on sensitivity to outcome devaluation. Experiment 2 demonstrated that animals exposed to amphetamine after initial lever-press training retained robust sensitivity to changes in reward value, indicating that amphetamine treatment disrupts the acquisition, but not expression, of goal-directed actions.
Several aspects of the current data deserve comment. The failure to detect sensitivity to outcome devaluation after amphetamine exposure in experiment 1 cannot be accounted for in terms of a general learning impairment. Amphetamine-sensitized animals acquired the instrumental response at equivalent rates to the vehicle-injected control group. Moreover, these animals showed extinction at rates comparable to those seen in the nondevalued, vehicle-pretreated animals. There is evidence that antagonism of dopaminergic systems by neuroleptics produces response patterns that resemble extinction (Phillips and Fibiger, 1979; Gray and Wise, 1980), whereas amphetamine (Fletcher, 1995, 1996; Foltin, 2004) and the D2 agonist quinpirole (Kurylo and Tanguay, 2003) have been shown to attenuate extinction. However, in neither experiment 1 nor experiment 2 did amphetamine sensitization have any discernible effect on the rate of extinction with the experimental parameters we used. Hence, it is unlikely that the failure to detect sensitivity to outcome devaluation in experiment 1 can be explained in terms of alterations in extinction processes at test by sensitization. Indeed, if this were the case, we would have expected to see equivalent results in experiments 1 and 2, in contrast to the actual findings.
The locomotor-activating effects of psychostimulants are also well documented (Stewart and Badiani, 1993). This was confirmed in the current experiments by heightened locomotor activity in response to an amphetamine challenge compared with drug-naive performance. However, this increase in locomotor activity did not correlate with performance in the devaluation extinction tests. Similarly, the observation that magazine entry behavior remained sensitive to changes in reward value, as well as the devaluation effect seen in the reacquisition test, confirm that the performance of amphetamine-sensitized animals was not simply a consequence of hyperactivity: they were able to suppress specific response tendencies in certain situations and were not impaired in their general ability to inhibit responding. Thus, the results cannot be accounted for in terms of enhanced locomotor activity or general response perseveration.
Nor did amphetamine sensitization change the motivational and incentive impact of the devaluation treatments used (Wyvell and Berridge, 2001). It is clear from the magazine entry data and the reacquisition test that insensitivity to outcome devaluation was not attributable to any ineffectiveness of the prefeeding treatment or a failure to acquire an aversion to the reinforcer after taste aversion training. In experiment 1, although lever-press performance in extinction was impervious to the shift in the value of the reinforcer, magazine entry behavior remained sensitive to manipulations in goal value, and the reacquisition test confirmed that all animals had acquired an aversion to the reinforcer. Similarly, because any explanation in terms of changes in the effectiveness of reward devaluation depends on effects restricted to the test phase of the experiment, the same insensitivity should also have been observed when sensitization occurred after training. Instead, in experiment 2, amphetamine-pretreated rats were as sensitive to the changed value of the reinforcer as vehicle-injected control animals. Hence, the insensitivity to outcome devaluation observed in the two extinction tests in experiment 1 can only be explained in terms of a failure to integrate knowledge about the changed value of the reinforcer with current actions rather than any differential impact of manipulations of reward value in sensitized animals.
An additional notable feature of the current data was the dissociation between magazine approach and instrumental lever pressing. We found that magazine entry behavior in an operant procedure remained sensitive to outcome devaluation after amphetamine sensitization, a finding entirely consistent with previous reports indicating that magazine entry behavior is under different psychological and neural control from the performance of lever pressing (Holland, 1979, 1998; Dickinson et al., 2000; Corbit et al., 2001; Killcross and Coutureau, 2003). A consideration of the proximity of these response classes to reward delivery may provide a possible reason for this dissociation. In line with current findings, evidence suggests (Balleine et al., 1995) that responses proximal to the goal (such as magazine entry) remain more sensitive to motivational shifts and devaluation procedures than responses more distal to reward (such as lever pressing). This has been characterized as reflecting the development of chains of responses in which only the terminal actions (those proximal to the goal) form a direct association with the reward, and actions earlier in the behavioral chain (and hence more distal to the goal) instead exert discriminative control over the production of more proximal actions (Balleine et al., 1995; Killcross and Blundell, 2002). Alternatively, there may be greater control of magazine approach responses by pavlovian, as opposed to instrumental, contingencies (Balleine et al., 1995).
In instrumental learning, reward delivery is explicitly contingent on specific responses such as lever presses, whereas in pavlovian conditioning, reward presentation is contingent on the presence of cues in the environment; these cues can come to elicit responses such as magazine approach (for example, by activation of a representation of reward), but reward delivery is strictly independent of such responses [see Killcross and Blundell (2002) for further discussion].
A recent study has reported a disruption of the effects of outcome devaluation on the pavlovian-conditioned magazine approach after cocaine sensitization (Schoenbaum and Setlow, 2005), and another has indicated that pavlovian-conditioned magazine approach shifts from being dopamine D1 receptor mediated to D1 independent with prolonged training (Choi et al., 2005). However, the conditioned magazine approach to a pavlovian cue remains behaviorally sensitive to reward devaluation regardless of overtraining (Holland, 1998); instrumental lever pressing does not. Hence, neither study of pavlovian conditioning is examining a behavioral response in which habitual performance as a consequence of overtraining leads to insensitivity to reward devaluation in normal animals. Overtraining produces a conditioned magazine approach that is insensitive to D1 manipulations, but this is unrelated to changes in reward sensitivity (because these are not a consequence of overtraining in normal animals). Similarly, cocaine sensitization renders magazine approach insensitive to reward devaluation, but this does not mimic the response of normal overtrained animals; rather, it reflects a novel insensitivity to goal value. This novel insensitivity has not been observed in magazine approach during instrumental lever pressing in previous behavioral (Balleine, 1992; Dickinson and Balleine, 1993) or neural (Killcross and Coutureau, 2003) studies or here after amphetamine sensitization. Rather, in our study, amphetamine sensitization mimics the effects of overtraining on lever pressing and magazine approach that are observed in normal animals.
Dopamine and habit learning
The finding that simple exposure to amphetamine renders instrumental responding insensitive to outcome devaluation concurs with recent evidence for a neural dissociation between a goal-directed action system involving the prelimbic prefrontal cortex and the dorsomedial striatum (Killcross and Coutureau, 2003; Yin et al., 2005), among other regions (Corbit et al., 2001, 2003; Balleine et al., 2003), and a habit system involving the infralimbic prefrontal cortex and the dorsolateral striatum (Mishkin et al., 1984; Reading et al., 1991; Jog et al., 1999; Packard and Knowlton, 2002; Coutureau and Killcross, 2003; Yin et al., 2004). The present results also support the involvement of dopaminergic tone in the process whereby S-R habits come to dominate instrumental performance (Canales, 2005). Evidence suggests that there are profound effects of psychostimulant sensitization in dorsal striatal terminal regions (Di Chiara and Imperato, 1988; Barrot et al., 1999; Canales and Graybiel, 2000; Ito et al., 2002). It has been demonstrated that amphetamine sensitization brings about differential changes in the responsiveness of neurons in matrix and striosome compartments of the striatum; activity in matrix neurons is reduced, leading to preferential activation of the striosomal system (Canales et al., 2002). In line with recent suggestions (Canales, 2005), this shift in activity patterns may represent the normal shift in neural activation during the transition between goal-directed and habitual behavior, a process that is facilitated by sensitization.
The current findings have important implications for our understanding of the control of voluntary, goal-directed behaviors and reflexive, stimulus-bound habitual responding. They support previous evidence for a dissociation of neural systems that subserve goal-directed actions and habits and implicate dopaminergic tone in the dominance of S-R associations and hence the progression from goal-sensitive to goal-independent behavior. More generally, the current findings underscore the significance of dopamine in learning and reward (Waelti et al., 2001; Wise, 2004). Imbalances to this system may be associated with certain human psychopathologies. For example, the demonstration here that amphetamine is able to subvert the habit system may provide one possible mechanism underpinning the transition from acute drug abuse to chronic addiction. Sensitization of dopaminergic systems, resulting in the promotion of S-R processes and the concomitant increase in the control of behavior by reward-related cues, may contribute to the development of compulsive drug taking (Robbins and Everitt, 1999; Everitt et al., 2001). Dysfunction in these systems may also relate to other neuropsychiatric conditions such as obsessive-compulsive disorder and Tourette’s syndrome, which are characterized by involuntary, reflexive, and repetitive behaviors (Graybiel and Rauch, 2000; Leckman and Riddle, 2000).
This work was supported by a Biotechnology and Biological Sciences Research Council (BBSRC) grant to S.K. and a BBSRC studentship award to A.N.
Correspondence should be addressed to Dr. Simon Killcross, School of Psychology, Tower Building, Park Place, Cardiff University, Cardiff CF10 3AT, UK.