The Journal of Neuroscience, December 1, 2000, 20(23):8954-8964
The Effect of Lesions of the Insular Cortex on Instrumental
Conditioning: Evidence for a Role in Incentive Memory
Bernard W. Balleine1 and Anthony Dickinson2
1 Department of Psychology, University of California,
Los Angeles, Los Angeles, California 90095, and
2 Department of Experimental Psychology, University of
Cambridge, Cambridge CB2 3EB, United Kingdom
ABSTRACT
In three experiments, we assessed the effect of lesions aimed at
the gustatory region of the insular cortex on instrumental conditioning
in rats. In experiment 1, the lesion had no effect on the
acquisition of either lever pressing or chain pulling in food-deprived
rats whether these actions earned food pellets or a maltodextrin
solution. The lesion did, however, attenuate the impact of outcome
devaluation, induced by sensory-specific satiety, on instrumental
performance but only when assessed in an extinction test. This effect
was not secondary to an impairment in instrumental learning; in
experiment 2, no evidence was found to suggest that the lesioned rats
differed from shams in their ability to encode the specific
action-outcome contingencies to which they were exposed during
training. In experiment 3, however, lesioned rats were found to be
insensitive to the impact of an incentive learning treatment conducted
when they were undeprived; although, again, this deficit was confined
to a test conducted in extinction. These results are consistent with
the view that, in instrumental conditioning, the gustatory region of
the insular cortex is involved in encoding the taste of food outcomes
in memory and, hence, in encoding the incentive value assigned to these
outcomes on the basis of prevailing motivational conditions.
Key words:
instrumental conditioning; gustatory cortex; insular
cortex; devaluation; sensory-specific satiety; incentive learning; contingency; motivation; reward
INTRODUCTION
Although the brain mechanisms of
reward have long been the subject of intensive investigation (for
review, see Robbins and Everitt, 1996), few attempts have been made to
specify how reward-related processes make contact with responses
instrumental to the delivery of rewarding events. Indeed, to the extent
that the structure of instrumental conditioning has been specified in
the neurobiological literature, it has typically been based on the
classic stimulus-response (S-R)/reinforcement system originally
advanced within Thorndike's (1911) "law of effect" (Donahoe
et al., 1993). There is, however, considerable evidence against
this suggestion coming primarily from studies assessing the impact of
post-training changes in the reward value of the instrumental outcome
on subsequent performance. In one study (Colwill and Rescorla, 1985),
for example, hungry rats were trained to lever press and chain pull
with one response earning food pellets and the other sucrose solution.
After this training, one outcome was devalued using a sensory-specific
satiety treatment; i.e., rats were allowed to consume one of the two
outcomes, pellets or sucrose, for 1 hr, before a choice extinction test conducted on the levers and chains. Although S-R theories predict that
this satiety treatment should not produce any differential effect on
performance, Colwill and Rescorla (1985) found that, on test, their
rats performed fewer of the response that, in training, had
delivered the outcome on which they were sated before the test than of
the other response.
A similar devaluation effect has also been reported in rats trained on
two levers that both deliver a nutritive outcome differing only with
respect to a single taste feature (Balleine and Dickinson, 1998, their
Experiment 1), suggesting that, in hungry rats, specific satiety-induced outcome devaluation affects instrumental performance through a change in the attractiveness of taste features of the food
outcome. This finding is important because it suggests that processes
involved in the detection and representation of taste may be critically
involved in encoding both the consequences of an action and changes in
the incentive value of those consequences in hungry rats.
From this perspective, recent suggestions that the cortical targets of
brainstem visceral and taste centers form a distributed memory system
encoding biologically potent sensory events are of considerable
interest (Braun, 1989; Rolls, 1994). Within this system, the ability of
animals to encode the specific taste features of biologically potent
events appears to depend on a region of the insular cortex centered on
the median artery above the rhinal sulcus, referred to as the gustatory
cortex (GC) (Braun et al., 1982; Kosar et al., 1986a,b). The best
evidence for this claim comes from the finding that damage to this
region strongly attenuates the impact of a conditioned taste aversion
treatment on the incentive value of foods and fluids (Braun et
al., 1982; Braun, 1989). This effect on taste aversion learning
is unlikely to be attributable to changes in the ability of
animals to detect the taste features of foods and fluids themselves;
taste thresholds for each of the primary tastes appear normal in
GC-lesioned rats and do not differ from unoperated control groups
(Braun et al., 1982), nor does the sensitivity of GC-lesioned rats to
the effects of emetic treatments appear to be modified because they
display a normal ability to develop potentiated aversions to odors
(Kiefer et al., 1984). Therefore, Braun (1990) proposed that GC lesions
induce a form of taste agnosia, hypothesizing that the GC provides
animals with the ability to recall taste features over the delay
between taste detection during consumption and detection of emesis (or
other post-ingestive effects), thereby allowing the formation of an association between taste and illness.
Given the evidence that taste features of the instrumental outcome play
a critical role in changes in incentive value induced by outcome
devaluation, the "taste memory" hypothesis of GC function predicts
that lesions of this structure should attenuate the impact of outcome
devaluation by sensory-specific satiety on instrumental performance in
hungry rats. Furthermore, this view specifies that the source of any
deficit in sensitivity to outcome devaluation should lie in the
inability of lesioned rats to recall the devalued taste and not in the
detection of the primary taste itself or in the assignment of incentive
value to the taste. Consequently, any effect of GC lesions on
devaluation should be confined to a situation in which the rat is
forced to recall changes in outcome value to formulate a course of
action. The aim of this series of experiments was to investigate this
claim by characterizing the effect of cell-body lesions of the GC on
instrumental conditioning.
MATERIALS AND METHODS
Experiment 1
Subjects and apparatus
Subjects were 20 male Hooded Lister rats (Harlan Olac, Bicester,
UK). They were housed in squads of four in a temperature- and
humidity-controlled room with lights on from 8:00 A.M. to 8:00
P.M. Upon delivery, rats were maintained with access to food and
water ad libitum. They were then operated on, at which point they weighed at least 300 gm. After surgery, the animals were returned
to their home cages for a period of 10 d recuperation. At this
point and for all stages of behavioral testing, the animals were
shifted to a 22.5 hr food deprivation schedule, under which they
received access to food for 1.5 hr each day in their home cage at least
2 hr after behavioral testing with water available ad
libitum.
Instrumental training and testing was conducted in four Campden
Instruments (Manchester, UK) operant chambers. Each chamber was
equipped with a recessed magazine, a retractable lever, and a chain.
The magazine was positioned in the center of the front wall and could
be entered via a flap door, which was attached to a microswitch. The
lever and chain (which was lowered through the ceiling from a
microswitch) were positioned symmetrically to the right and left side
of the magazine flap, respectively. The chambers were also fitted with
a pellet dispenser and a peristaltic pump, both of which were
programmed to deliver the instrumental outcomes into the recessed
magazine. The outcomes used were a 45 mg Noyes pellet (formula A) and
0.05 ml of a 20% solution of maltodextrin (Cerestar Ltd., Manchester,
UK). Each chamber was illuminated by a 3 W house light mounted in the
center of the front panel above the magazine. A BBC microcomputer
equipped with the SPIDER extension for on-line control (Paul Fray Ltd.,
Cambridge, UK) controlled the equipment and recorded lever presses and
chain pulls. For the presentation of the outcomes outside the operant chambers, eight feeding cages were used. These were molded plastic boxes, 30 × 13 × 11 cm in size, with wire mesh ceilings.
Pellets were given in small glass dishes placed inside these cages,
whereas maltodextrin was given through calibrated drinking tubes
inserted through a hole in the wire mesh ceiling.
Surgical procedures
Animals were anesthetized using a barbiturate-alcohol
preparation (0.3 ml/100 gm). After being marked for identification and shaved, they were placed in the stereotaxic frame (David Kopf Instruments, Tujunga, CA), and an incision was made into the scalp to
expose the skull. The incisor bar was then adjusted to the level-head
position. Rats in group GC (n = 10) received
intracranial injections of 1.0 µl of 0.09 M
quinolinic acid dissolved in PBS at the following coordinate
sites: anteroposterior (AP), +1.2; mediolateral (ML), ±3.0; and
dorsoventral, 5.0 with the infusion needle lowered at an angle
of 20° in the ML plane. One injection was given on the left side and
a second on the right side of the midline, using level-head coordinates
derived from the stereotaxic atlas of Paxinos and Watson (1986). The
injections were made using a 10 µl Hamilton syringe through a 30 gauge injection cannula, which was glued into a 23 gauge sleeve for
support. The toxin was infused at the rate of 0.33 µl/min. The
cannula was then left in place for 3 min to allow diffusion of the
toxin away from the cannula tip before being raised.
To control for the effects of anesthesia, being placed in the
stereotaxic instrument, skull holes, and the lowering of the injection
cannula into the brain, the behavior of the lesioned animals was
compared with that of sham-operated rats. For the rats in group sham
(n = 10), exactly the same surgical procedure was
conducted, except that the injection cannula was filled with PBS alone,
lowered to the same position as for the GC group, and no fluid was injected.
Histological procedures
At the end of the experiment, the animals were killed
using a lethal barbiturate overdose and then perfused transcardially with 0.9% saline followed by 10% formalin solution. The brains were
stored in 10% formalin solution for 48 hr before being transferred to
a 25% sucrose solution. Over a period of days, the brains were allowed
to sink in the sucrose solution before 60 µm frozen coronal sections
were cut throughout the region of the GC, mounted on glass slides, and
stained with cresyl violet. Slides were examined for extent of lesion
by microscopically examining sections with reference to the stereotaxic
atlas of Paxinos and Watson (1986). Histological assessment was
conducted by comparing lesioned brains with the sham brain and by
looking for the following features: gross morphological changes, such
as holes and tissue collapse; the position and extent of gliosis and
scarring; the cannula tract and injection placement; and signs of
neuronal cell body shriveling and loss.
Procedure
Except where indicated, the rats were run twice daily in five
squads of four, each squad containing subjects from both lesion conditions, counterbalanced for operant box.
Instrumental acquisition. Three days after the shift to a
food deprivation schedule, the behavioral phase began with 3 d of magazine training in the operant chambers. Pellets and maltodextrin were delivered noncontingently into the magazine on a random time 30 sec schedule in two separate sessions. Throughout the experiment, each session began with the onset of the house light and
terminated with its offset after 20 min. The assessment of instrumental
acquisition began on day 4. Action-outcome assignment was
counterbalanced such that, for five animals in group GC and five in
group sham, pressing the lever delivered the pellets and pulling the
chain delivered the maltodextrin; for the remaining animals in each
group, the action-outcome assignments were reversed. Throughout
training, the rats were given two separate training sessions each day:
one on the lever alone and the other on the chain alone, with the
action that was trained first on each day alternating from one day to
the next. At no stage were the lever-pressing and chain-pulling actions
explicitly shaped by the experimenter. During this phase, animals were
trained on a fixed interval (FI) 20 sec schedule, and this
training continued on each action until each animal had earned 100 of
each outcome, at which point the acquisition phase terminated. The
effect of the delivery of each outcome on the number of actions
performed before the delivery of the next outcome was used as a measure
of the rate of acquisition.
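The acquisition measure described above can be sketched in a few lines. This is only an illustrative sketch, not the control program used in the study: the function name, the example response times, and the assumption that the 20 sec interval is timed from session onset and then from each outcome delivery are ours, and the count here includes the response that earns the outcome.

```python
def responses_per_outcome(response_times, interval=20.0):
    """Count the responses emitted before each outcome on a fixed interval schedule.

    Assumption: the interval is timed from session onset and then from each
    outcome delivery; the first response at or after it elapses is reinforced.
    """
    counts = []
    interval_start = 0.0
    emitted = 0
    for t in sorted(response_times):
        emitted += 1
        if t - interval_start >= interval:
            counts.append(emitted)  # this response earned the outcome
            interval_start = t
            emitted = 0
    return counts


# Hypothetical response times (sec from session onset), not data from the study.
example_counts = responses_per_outcome([5, 21, 30, 39, 42, 70])
print(example_counts)
```

Averaging successive groups of five counts from such a list would give the kind of blocked acquisition curve used in the analysis.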
During the acquisition phase, two animals in group sham and two in
group GC failed to acquire lever pressing and/or chain pulling and were
discarded from the experiment. In both groups, the two discarded animals had been
assigned to different action-outcome relationships, and so the groups,
both now with n = 8, remained balanced for this factor
(four in each action-outcome assignment per group).
Instrumental training. On the day after the acquisition
assessment was terminated, animals were trained to lever press and chain pull on a constant probability schedule that delivered the appropriate outcome with a fixed probability for the first response in
each second, again with each action trained separately in each session.
This probability was 0.25 in the first session, 0.1 in the following
two sessions, and 0.05 in the next eight sessions. Action-outcome
assignment was the same as that in the acquisition phase. Again, at no
stage were the lever-pressing and chain-pulling actions explicitly
shaped by the experimenter. This constant probability schedule
approximates to a random ratio (RR) schedule with a mean ratio
parameter increasing from 4 through 10, to 20 responses.
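The correspondence between the outcome probabilities and the nominal ratio values follows from the mean of a geometric distribution (1/p eligible responses per outcome). The sketch below illustrates this and simulates one session; the function names, the seed, and the response pattern are hypothetical and are not taken from the original control software.

```python
import random


def mean_ratio(p):
    """Mean number of eligible responses per outcome when each pays off with probability p."""
    return 1.0 / p


def outcomes_earned(response_times, p, rng):
    """Simulate one session: only the first response in each second is eligible."""
    eligible_seconds = {int(t) for t in response_times}
    return sum(1 for _ in eligible_seconds if rng.random() < p)


ratios = [round(mean_ratio(p)) for p in (0.25, 0.1, 0.05)]
print(ratios)

rng = random.Random(0)  # arbitrary seed, for reproducibility only
# Hypothetical rat responding twice per second for 20 min; only 1200 seconds are eligible.
times = [i * 0.5 for i in range(2400)]
print(outcomes_earned(times, 0.05, rng))
```

Because only the first response in each second is eligible, the schedule approximates a true random ratio schedule only when response rates stay at or below about one response per second.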
Outcome devaluation: extinction test. The devaluation
treatment was conducted on the day after the final instrumental
training session. This was accomplished by prefeeding the rats with one of the two outcomes for 60 min in the feeding cages. The allocation of
outcomes to each rat for the prefeeding was counterbalanced within each
lesion group, for both the action whose outcome was devalued (i.e.,
lever vs chain) and the outcome devalued (i.e., pellets vs
maltodextrin). Thus, in both group GC and group sham, for four
rats, the lever outcome was devalued, whereas for four, the chain
outcome was devalued; and for four rats, pellets were devalued, whereas
for four rats, maltodextrin was devalued. Immediately after this
treatment, the rats were placed in the operant chambers for the choice
extinction test. In this test, both the lever and the chain were
available, but neither of the two outcomes was delivered.
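One way to obtain the counterbalancing described above is to cross the two action-outcome assignments with the two prefed outcomes and place two rats in each cell. The sketch below assumes that this is how the eight rats per group were allocated; the text gives only the resulting marginal counts, and all names in the code are illustrative.

```python
from collections import Counter
from itertools import product

ASSIGNMENTS = (
    {"lever": "pellets", "chain": "maltodextrin"},
    {"lever": "maltodextrin", "chain": "pellets"},
)
OUTCOMES = ("pellets", "maltodextrin")


def devaluation_design(rats_per_cell=2):
    """Cross action-outcome assignment with prefed outcome (assumed 2 x 2 layout)."""
    design = []
    for assignment, devalued in product(ASSIGNMENTS, OUTCOMES):
        # The devalued action is whichever action earned the prefed outcome.
        devalued_action = next(a for a, o in assignment.items() if o == devalued)
        for _ in range(rats_per_cell):
            design.append({"devalued_outcome": devalued,
                           "devalued_action": devalued_action})
    return design


design = devaluation_design()
action_counts = Counter(rat["devalued_action"] for rat in design)
outcome_counts = Counter(rat["devalued_outcome"] for rat in design)
print(action_counts)
print(outcome_counts)
```

With two rats per cell, each group of eight has four rats with the lever outcome devalued, four with the chain outcome devalued, four prefed pellets, and four prefed maltodextrin, matching the counts given in the text.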
Outcome devaluation: reward test. The day after the
extinction test, the animals were retrained on the two manipulanda in separate sessions on the RR 20 schedule. On the next day, the rats were
given a reward test conducted with both the levers and chains present.
This test differed procedurally from the devaluation treatment and
extinction test only to the extent that the two outcomes were delivered
as a consequence of instrumental performance. In this session, the two
outcomes were delivered on independent ratio schedules, with each
outcome earned on an RR 20 schedule (i.e., with a probability of 0.05).
Before this second test, the rats consumed the same outcome that they
had been given before the extinction test for 1 hr in the feeding cages.
Statistics
In this experiment and in all subsequent studies, an α level of 0.05 was used to assess the statistical significance of the data analyses.
Experiment 2
Subjects and apparatus
The subjects and apparatus were the same as those used in
experiment 1.
Procedure
After the reward test of experiment 1, all rats received two
sessions of retraining on each of the two manipulanda in separate sessions with each outcome delivered with a probability of 0.05 for the
first response in each second, as in the training phase of experiment
1. On the following day, the contingency assessment began. The rats
continued to be trained on the two manipulanda with the appropriate
paired outcome delivered with a probability of 0.05 in separate 30 min
sessions each day. They earned the same outcomes as in experiment 1, but, in addition, one of the two outcomes was also delivered unpaired
in each of the sessions, such that one of the action-outcome
contingencies was degraded and the other was not. Thus, for each
subject, the unpaired outcome was the same as the paired outcome in one
of the daily sessions and different from the paired outcome in the
other. These unpaired outcomes were also delivered with a probability
of 0.05 but in each second without a response. Within each group, the
type of unpaired outcome (food pellets vs maltodextrin solution)
delivered was counterbalanced with respect to the action-outcome
assignment. Thus, for half of the animals trained to lever press for
pellets and to chain pull for maltodextrin, the unpaired outcome was
pellets, whereas for the other half, it was maltodextrin, and likewise for the animals that earned maltodextrin on the lever and pellets on
the chain. The contingency assessment lasted for four sessions on each
action conducted on successive days using this schedule. On the fifth
day, responding on both manipulanda was extinguished in separate
sessions in the absence of any outcomes.
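The degradation treatment can be summarized with a simple contingency index, the difference between the probability of an outcome in a second with a response and in a second without one. This index is not named in the procedure above; it is used here only as a convenient sketch, and the function and constant names are ours.

```python
P_PAIRED = 0.05    # outcome probability for the first response in each second
P_UNPAIRED = 0.05  # outcome probability in each second without a response


def delta_p(paired_outcome, unpaired_outcome):
    """Return P(outcome | response) - P(outcome | no response) for the paired outcome."""
    p_without_response = P_UNPAIRED if unpaired_outcome == paired_outcome else 0.0
    return P_PAIRED - p_without_response


# Hypothetical rat: lever earns pellets, chain earns maltodextrin, pellets unpaired.
print("lever:", delta_p("pellets", "pellets"))
print("chain:", delta_p("maltodextrin", "pellets"))
```

For the action whose outcome is also delivered unpaired, the index is zero, so responding no longer changes the likelihood of that outcome; for the other action, the original contingency is preserved.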
Unless otherwise stated, the procedures used in experiment 2 were the
same as those used in experiment 1.
Experiment 3
Subjects and apparatus
Subjects were 20 experimentally naive male Hooded Lister rats
(Harlan Olac). They were housed and maintained as described in
experiment 1.
The apparatus used for instrumental training and testing was the same
as that described for experiment 1. For the incentive learning phase
(see below), the eight feeding cages described in experiment 1 were
used. Again, pellets were given in small glass dishes placed inside
these cages, whereas maltodextrin was given through calibrated drinking
tubes inserted through a hole in the wire mesh ceiling.
Surgical procedures
The surgical procedures for both the sham and lesioned groups,
including lesion coordinates and the concentration and volume of the
quinolinic acid infused, were exactly as described in experiment 1.
Histological procedures
Histological procedures were exactly as described in experiment 1.
Procedure
The experiment was conducted in four stages: (1) instrumental
training; (2) incentive learning; (3) extinction test; and (4) reacquisition tests. Except where indicated, the rats were run twice
daily in five squads of four, each squad containing subjects from both
lesion conditions, counterbalanced for operant box. Before the start of
instrumental training, rats were placed on the 22.5 hr food deprivation
schedule used in the previous studies.
Instrumental training. Magazine training was conducted as
described for experiment 1. On the 2 d after magazine training, animals were trained to chain pull and then to lever press with the
appropriate reinforcer delivered on a continuous reinforcement schedule. Action-outcome assignments were counterbalanced such that,
for five animals in each lesion group, lever pressing delivered food
pellets and chain pulling delivered maltodextrin. The remaining animals
in each group received the opposite action-outcome assignment. After
initial acquisition, animals were trained on the fixed probability schedules of reinforcement as described in experiment 1. The
probability of reinforcement was again 0.25 in the first session (i.e.,
RR 4), 0.1 in the following two sessions (i.e., RR 10), and 0.05 in the
next eight sessions (i.e., RR 20). Again, at no stage were the
lever-pressing and chain-pulling actions explicitly shaped by the
experimenter. Each session began with the onset of the house light and
terminated with its offset after 20 min.
During this phase, two animals in each lesion group (one from each
action-outcome assignment in each group) failed to acquire one or other
action and so were dropped from the experiment. Consequently, both the
sham and lesioned groups were reduced to eight subjects per group.
Incentive learning. On the day after the final training
session, the incentive learning phase began and continued for 6 d. Animals received exposure to both outcomes with one reexposed in a high
deprivation state (i.e., 22.5 hr) and the other reexposed in the low or
undeprived state (i.e., 0 hr) with three sessions of reexposure given
to each outcome. The reexposure treatment was counterbalanced within
both lesion groups, both with respect to which outcome was reexposed in
the undeprived state and for the order of deprivation conditions. To
achieve this counterbalancing, for half of the animals assigned to each
action-outcome condition in each group, the pellets were reexposed in
the low deprivation state and the maltodextrin reexposed in the high
deprivation state. The remaining animals in each group received the
opposite deprivation state-outcome assignment. Furthermore, half of the
animals in each deprivation state-outcome assignment condition were
deprived in a low-high-low-high-low-high order of deprivation
conditions, whereas the remaining animals were given the reverse order.
Thus, for the first session of reexposure, half of the animals in each group were maintained on the 22.5 hr food deprivation schedule after
the final session of instrumental training. The remaining animals were
given ad libitum access to their maintenance diet in their
home cage from ~2 hr after the final training session until the first
reexposure session the next day. After the first reexposure session,
animals given this exposure when food deprived were now allowed
ad libitum access to the maintenance diet in their home cage
from ~2 hr after the reexposure session until the second reexposure
session the next day. Animals given the first reexposure when
undeprived were then deprived of food until the second reexposure
session the next day. In this manner, six sessions of reexposure were
given, three when food deprived and three when undeprived, with the
deprivation conditions alternating in counterbalanced order.
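The reexposure sequence for an individual rat can be generated from two choices: which outcome is reexposed in the low deprivation state and which state comes first. The sketch below is illustrative only; the function and variable names are not from the original procedure.

```python
OUTCOMES = ("pellets", "maltodextrin")


def reexposure_schedule(low_outcome, first_state, n_sessions=6):
    """List (day, deprivation state, outcome) for one rat's reexposure phase."""
    high_outcome = OUTCOMES[1 - OUTCOMES.index(low_outcome)]
    states = ("low", "high") if first_state == "low" else ("high", "low")
    schedule = []
    for day in range(n_sessions):
        state = states[day % 2]  # deprivation state alternates daily
        outcome = low_outcome if state == "low" else high_outcome
        schedule.append((day + 1, state, outcome))
    return schedule


for row in reexposure_schedule("pellets", "low"):
    print(row)
```

Crossing the two values of each argument with the two action-outcome assignments gives the eight counterbalancing conditions implied by the description above.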
The reexposure sessions themselves were each of 15 min duration. When
animals were reexposed to the pellets, they were placed in the feeding
cages, with the glass dishes each containing 50 of the food pellet
outcome. When animals were reexposed to the maltodextrin solution, they
were placed in the feeding cages for 15 min but the drinking tubes
containing 50 ml of maltodextrin were attached to the cages for the
final 5 min of this period only. After the final session of reexposure,
all animals were given ad libitum access to their
maintenance diet until the extinction test conducted the next day.
Outcome devaluation: extinction test. The choice extinction
test was conducted on the day after the final reexposure session with
all animals tested in the low deprivation state. This test was
otherwise conducted exactly as that described for experiment 1. Thus,
both the lever and the chain were available, but no outcomes of any
kind were scheduled during this session.
Reacquisition tests. Two hours after the extinction test,
half of the animals in each group continued to receive ad
libitum access to their maintenance diet in their home cage. The
remaining animals were deprived of food until the first reacquisition
session the next day. For this session, the performance of all of the animals was assessed on the chains and the levers in separate sessions
on the RR 20 schedule using the same procedure as the instrumental
training phase. Two hours after this session, rats that were food
deprived for the first session were given ad libitum access
to food, whereas rats that were undeprived were now food deprived until
the second reacquisition test conducted the next day. In this manner,
all animals received four reacquisition tests, two when undeprived and
two when food deprived.
RESULTS
Experiment 1
Experiment 1 was divided into four stages: (1) instrumental
acquisition; (2) instrumental training; (3) a test of specific satiety-induced outcome devaluation conducted in extinction; and (4) a
test of specific satiety-induced outcome devaluation conducted with
reward. After recovery from surgery, we maintained both group GC and
their sham-operated counterparts on a food deprivation schedule and
trained them to perform two instrumental actions, lever pressing and
chain pulling, in different sessions with one action earning access to
food pellets and the other to a maltodextrin solution. To assess the
effect of the lesion on specific satiety-induced outcome devaluation,
we gave the rats 1 hr exposure to one of the outcomes, immediately
after which they were returned to the operant chambers and their
tendency to lever press and chain pull was assessed in a choice
extinction test. To assess whether any effect of the lesion observed in
the extinction test was attributable to a deficit in memory rather than
in devaluation per se, after a period of retraining, we conducted a
second, rewarded choice test immediately after prefeeding on one of the
outcomes. In this test, the action that delivered the outcome on which
the rats were sated and the action that delivered the outcome on which they were not sated were the same as in training.
On the basis of our previous findings, we anticipated that sham
animals would perform fewer of the action that, in training, delivered
the devalued outcome than of the other action in both the extinction
and the rewarded choice tests. If the GC is involved in encoding
changes in incentive value of the instrumental outcome induced by the
specific satiety treatment, lesions of the GC should be anticipated to
reduce this outcome devaluation effect but only in the test conducted
in extinction. Thus, in contrast to the extinction test, a reliable
devaluation effect should be anticipated in the rewarded choice test
when the two outcomes are actually presented and, hence, when the
relative incentive value of each outcome need not be recalled but is
available for direct assessment.
Histology
Figure 1 illustrates the largest and
the smallest areas of lesion damage observed in experiment 1, and
Figure 2 provides a photomicrograph of a
representative section taken from a lesioned rat. Generally, the lesion
was ventral to somatosensory cortex and was found to be most extensive
in the granular and dorsal agranular areas of the insular cortex.
Damage to these areas was observed throughout the rostrocaudal extent
of the GC region as defined previously (cf. Braun et al., 1982; Kosar
et al., 1986a), with evidence of extensive bilateral cell loss and
gliosis centered dorsal to the rhinal sulcus and ventrolateral to the
external capsule. Damage to the ventral agranular area was also
observed but was primarily confined to the region directly ventral to
the site of infusion. Gross morphological changes became more evident as the lesion extended laterally, with holes and some tissue collapse noted in superficial cortical layers. Lesion damage was very similar in
six rats in group GC but was found to be mildly asymmetrical in the two
remaining rats, being less extensive and more medial (bordering the
claustrum) on one side relative to the other. Although this suggested
that the surface layers of the cortex may well have been spared in one
hemisphere of these animals, there was, nevertheless, considerable
overlap with the contralateral lesion as well as with the damage
observed in the other rats. For this reason and because we had no a
priori grounds for regarding this variability as critical, we did not
feel justified in dropping these animals from the study. Because we
could observe no evidence of damage to the insular cortex in any of the
rats in group sham, the behavioral data from all of the rats in group
GC and group sham that completed instrumental training were analyzed
(i.e., n = 8 per group).

Figure 1.
Experiment 1. Diagrams of coronal sections
(0.48-1.7 mm anterior to bregma) on which the extent of cell loss
observed after bilateral infusions of quinolinic acid aimed at the
gustatory region of the insular cortex has been reconstructed from
histology to reveal the largest (darker) and smallest
(lighter) regions of damage induced in group GC. The
lines drawn on the right-hand hemisphere extending
laterally from the corpus callosum (cc) and the
claustrum (CL) reflect divisions between (from the
top) primary somatosensory (S),
granular (GI), dorsal agranular
(dAI), and ventral agranular
(vAI) regions of insular cortex, respectively, as
labeled on the lowermost section. rf, Rhinal
fissure.

Figure 2.
Experiment 1. Photomicrograph showing a
Nissl-stained coronal section through the insular cortex (~1.0 mm
anterior to bregma). A shows this section in low
magnification, whereas in B, the area enclosed by the
dashed rectangle in A has been enlarged
to give a clearer indication of the lesion boundaries
(arrows). cc, Corpus callosum;
rf, rhinal fissure.
Instrumental acquisition
The results of the instrumental acquisition phase are presented
for group GC and group sham in Figure
3 separately for the acquisition of lever
pressing (left panel) and the acquisition of chain
pulling (right panel). In general, it is clear that
the FI 20 sec schedule was successful in establishing slow and orderly acquisition of the two instrumental actions as assessed by the number
of actions performed after the delivery of each outcome. Both actions
were acquired at a similar rate, and the lesion did not have any
consistent impact on either the rate of acquisition or asymptotic level
of performance of the two instrumental actions, a conclusion that was
supported by the statistical analysis.

Figure 3.
Experiment 1. The mean number of lever presses
(left) and chain pulls (right) performed
per outcome during instrumental acquisition on the FI 20 reinforcement
schedule used in experiment 1. Data are averaged across blocks of five
outcomes and are presented separately for group GC
(filled circles) and group sham (open
circles).
For this analysis, a three-way mixed ANOVA was conducted with a
between-subjects factor of group and within-subjects factors of action
and of block, averaging the performance of each action into blocks of
five outcomes. This analysis revealed no main effect of group
(F < 1) or of action
(F(1,14) = 1.79), nor was the
group × action interaction reliable (F < 1). A
significant effect of block was found
(F(19,266) = 19.85), but neither the
group × block (F < 1), action × block
(F(19,266) = 1.42), nor the three-way interaction (F < 1) was reliable.
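The degrees of freedom reported for this analysis can be checked against the design: 16 rats in two groups, two actions, and 20 blocks of five outcomes. The sketch below assumes a standard mixed (split-plot) ANOVA with no sphericity correction; the helper function is hypothetical and is not part of any statistics package.

```python
from math import prod


def effect_df(n_subjects, n_groups, within_levels=(), with_group=False):
    """Numerator and error df for one effect in a mixed (split-plot) ANOVA.

    within_levels lists the number of levels of each within-subjects factor
    in the effect; with_group marks effects that include the group factor.
    """
    within_df = prod(levels - 1 for levels in within_levels)  # 1 if none
    numerator = within_df * ((n_groups - 1) if with_group else 1)
    error = (n_subjects - n_groups) * within_df
    return numerator, error


N_RATS, N_GROUPS, N_ACTIONS = 16, 2, 2
N_BLOCKS = 100 // 5  # 100 outcomes per action in blocks of five

print("group:", effect_df(N_RATS, N_GROUPS, with_group=True))
print("action:", effect_df(N_RATS, N_GROUPS, (N_ACTIONS,)))
print("block:", effect_df(N_RATS, N_GROUPS, (N_BLOCKS,)))
```

Under these assumptions, the between-subjects error term has 14 degrees of freedom and any effect involving block has 19 and 266, in line with the values reported for the action and block main effects.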
Outcome devaluation: extinction test
The results of the extinction test are presented in Figure
4 separately for the action that, in
training, delivered the outcome subsequently devalued by specific
satiety (i.e., the devalued action) and for the action trained with the
outcome that remained valued (i.e., the valued action). The performance
on the valued and devalued actions is presented separately for group
sham (middle panel) and group GC (right
panel). The data from the final training session on the RR
20 schedule are presented in the left panel (see below for
discussion). The results from the extinction test for group sham are
clear; the performance of the devalued action was markedly reduced
compared with that of the valued action. Although performance generally
declined over the course of the extinction session, from the very first
2 min period, a striking difference in performance was evident.
Importantly, group GC did not show this difference. Indeed, in this
group, no clear or consistent evidence of a devaluation effect emerged
at any point during the extinction test. Again, performance declined over the course of extinction, but both actions appeared to be performed at a high but similar rate, with neither action revealing the
effects of the specific satiety devaluation treatment.

Figure 4.
Experiment 1. The number of lever presses and
chain pulls (i.e., actions) per minute during instrumental training
(left) and during the choice extinction test conducted
after one of the training outcomes was devalued by a specific satiety
treatment. Data from the extinction test are presented for group sham
(middle) and group GC (right) averaged
across 2 min periods with performance of the action that previously
delivered the pre-fed, i.e., devalued (Deval),
outcome (filled circles) presented separately
from performance of the action that had delivered the non-pre-fed,
i.e., valued (Val), outcome (open
circles) for each group.
For the statistical analysis, a three-way mixed ANOVA was conducted
with a between-subjects factor of group and within-subjects factors of
devaluation, separating performance on the devalued action from that on
the valued action, and of period, separating performance into 2 min
periods. This analysis revealed no main effect of group
(F < 1) but a main effect of devaluation
(F(1,14) = 8.96) and, most
importantly, a significant group × devaluation interaction
(F(1,14) = 12.39). Simple main effects
analysis conducted on this significant interaction revealed that,
whereas performance on the valued action did not differ between groups
(F(1,14) = 2.69), performance of the
devalued action did differ, with the animals in group GC performing at a
significantly higher rate than those in group sham
(F(1,14) = 7.26). In addition, whereas
a significant devaluation effect emerged in group sham
(F(1,14) = 21.2), no such effect was
found in group GC (F < 1). Finally, the overall analysis revealed both an effect of period
(F(9,126) = 9.04) and a
devaluation × period interaction
(F(9,126) = 5.13), confirming that
performance generally declined over the extinction session and at a
faster rate for the valued than for the devalued action.
These effects of a lesion of the GC on the outcome devaluation effect
occurred only during the test and were not present in the training
data. The data from the final training session on the RR 20 schedule
are presented in the left panel of Figure 4. As is clear
from that figure, performance between the two groups was very similar,
as was their performance on the devalued and valued actions. Analysis
of these data revealed no main effect of group or devaluation or any
interaction between these factors (F < 1).
Outcome devaluation: reward test
The results of the reward test are presented in Figure
5, again separately for devalued and
valued actions and for group sham (left panel) and
for group GC (right panel). As in the extinction test, devaluation of the instrumental outcome induced a strong reduction in the performance of the devalued action in group sham. In
contrast to the test conducted in extinction, however, this effect was
also observed in group GC. A three-way mixed ANOVA found a significant
effect of devaluation (F(1,14) = 28.9)
but no effect of group (F(1,14) = 1.77) or a group × devaluation interaction (F < 1). Furthermore, there was no effect of period nor were any of the
other interactions involving group, devaluation, or period significant
(largest F(9,126) = 1.32).

Figure 5.
Experiment 1. The number of lever presses and
chain pulls (i.e., actions) per minute during the choice reward test
conducted after one of the training outcomes was devalued by a specific
satiety treatment. In contrast to the extinction test, performance of
lever-press and chain-pull actions delivered the training outcomes on
independent random ratio schedules. Data from the reward test are
presented for group sham (left) and group GC
(right) averaged across 2 min periods with performance
of the action that previously delivered the pre-fed, i.e., devalued
(Deval), outcome (filled
circles) presented separately from performance of the action
that had delivered the non-pre-fed, i.e., valued
(Val), outcome (open circles) for
each group.
Again, this effect of outcome devaluation was found on test and was not
present in the retraining session conducted between the extinction test
and the reward test. Comparable analysis of that training session
revealed no effect of lesion, of devaluation, or any interaction
between these factors (F < 1). Rate of performance on
the devalued and valued actions, respectively, for the two groups was
as follows: group sham, 23.4 and 22.1 actions per minute; group GC,
20.1 and 21.3 actions per minute.
The results of experiment 1 accord with the taste memory
hypothesis of GC function advanced in the introductory remarks.
Although lesions of the GC had no detectable effect on the acquisition or subsequent performance of either lever pressing or chain pulling for
either outcome, they had a clear effect on the sensitivity of
instrumental performance to the effects of outcome devaluation induced
by a specific satiety treatment. Thus, in contrast to the differential
performance observed in the sham-lesioned controls, GC-lesioned rats
performed both actions at a similar rate and appeared unable to
integrate the current incentive value of the instrumental outcomes,
induced by the prefeeding, with the specific action-outcome
relationships to which they were exposed during training.
This effect of the lesion was, however, only observed in extinction and
was not found in the reward test; when the devalued outcome was
delivered contingent on performance in the reward test, the lesioned
rats demonstrated a clear preference for the action delivering the
nondevalued outcome relative to the other action. This finding is
important because it helps to rule out several alternative accounts of
the deficit observed in the extinction test. Thus, differential
performance of the lesioned rats in the reward test suggests that the
deficit in extinction was not attributable to insensitivity to the
specific satiety treatment nor was it attributable to an inability to
discriminate either the two instrumental outcomes or the two
instrumental actions. Consequently, the results of experiment 1 suggest
that the lesioned rats failed to encode the taste features of the two
instrumental outcomes with the incentive value of these outcomes and,
therefore, after sensory-specific satiety, they were unable to recall
the value of one outcome relative to the other and so formulate a
course of action when forced to rely on their memory of relative
outcome value in the extinction test. When the valued and devalued
outcomes were actually presented in the reward test, however, there was
no longer any need to rely on memory of the relative value of the two
outcomes and, therefore, the lesioned rats were able to choose appropriately.
Experiment 2
There is, however, an alternative account of the results of
experiment 1 that remains to be assessed before we can accept that the
insular cortex plays a role in taste-mediated memory of incentive
value. The devaluation effect observed for group sham in the first
study demonstrates that performance of the instrumental actions in our
procedure is, at least in part, based on knowledge of the
action-outcome contingency. If the lesioned animals in this experiment
were unable to encode this contingency, then they may have acquired
instrumental performance using only an S-R mechanism. Therefore,
performance during the extinction test would have been resistant to the
outcome devaluation treatment. This account also explains the divergent
results found in the extinction and reward tests if it is assumed that
a reduction in the reinforcing efficacy of one or other outcome through
prefeeding failed to maintain the S-R association for the response
trained with the pre-fed outcome in the reward test. Although such an
account implicates the GC in contingency learning, its logical
possibility makes it necessary to assess whether the GC-lesioned
animals are able to encode specific action-outcome relationships.
In fact, a number of investigators have established that, in intact
animals, instrumental performance is sensitive to the contingency or
causal relationship between performance of an action and delivery of
its specific outcome (Hammond, 1980 ; Colwill and Rescorla, 1986 ;
Dickinson and Mulatero, 1989 ; Corbit and Balleine, 2000 ). For
example, Corbit and Balleine (2000) found that, when the action-outcome
contingency was degraded for one of two action-outcome relationships by
delivering one outcome with an equal probability both after performance
of the appropriate action and in periods when the action was not
performed, animals selectively decreased their performance of the
action for which the contingency had been degraded but continued to
perform the other action. In experiment 2, this same procedure was used
to assess whether the rats in group GC in experiment 1 were able to
encode the action-outcome contingencies.
To assess contingency sensitivity, after retraining with the same
action-outcome assignment used in experiment 1, the rats were shifted
to a schedule in which the probability of outcome delivery after
performance of each action remained the same, but, in addition, one of
the two outcomes was also delivered noncontingently with the same
probability in each second without a response. As a consequence, one
action-outcome contingency was degraded, whereas the other remained
intact. Four sessions of contingency assessment were conducted in this
manner, after which extinction tests were conducted on both the lever
and the chain to assess any contribution to performance made by
increased satiety on one outcome relative to the other.
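One common way to summarize this manipulation is the difference between the probability of an outcome given the action and its probability in the absence of the action, often written ΔP. The sketch below is illustrative only: the probability of 0.05 is a hypothetical value, not one reported for this phase, and `delta_p` is a name introduced for the example.

```python
def delta_p(p_outcome_given_action, p_outcome_given_no_action):
    """Contingency as the difference between two conditional probabilities."""
    return p_outcome_given_action - p_outcome_given_no_action


P = 0.05  # hypothetical probability of outcome delivery; not from the paper

# Degraded action: its outcome is equally likely with or without the action.
degraded = delta_p(P, P)

# Nondegraded action: its own outcome is never delivered without the action.
intact = delta_p(P, 0.0)

print(f"degraded: {degraded:.2f}, nondegraded: {intact:.2f}")
```

On this view, the noncontingent deliveries leave the response-outcome probability untouched but remove the causal advantage of performing the degraded action, which is why a selective decline in that action indicates contingency encoding.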
We anticipated, based on previous reports, that the animals in group
sham would demonstrate sensitivity to degradation of one instrumental
action-outcome contingency and reduce their performance of an action
when its outcome is the same as the outcome delivered noncontingently.
Likewise, if GC-lesioned animals are able to encode the specific
action-outcome contingencies, then we should anticipate a similar
effect to that predicted in group sham. If, however, the results of
experiment 1 reflect the fact that group GC were unable to encode
specific action-outcome contingencies, then no selective effect of this
treatment should be anticipated and so no difference should emerge in
the performance of the two actions.
The results from the contingency assessment are presented in Figure
6, separately for each of the four
sessions of this phase (left four panels) and for the
extinction test (far right panel). The data
for group sham (top panel) and group GC
(bottom panel) are also presented separately for the
actions for which the paired and unpaired outcomes were either the same
(Same) or different (Diff). Over the four
sessions of training, animals reduced their performance of the same
action more than the different action, suggesting that the
action-outcome contingency was successfully degraded by this
manipulation. This result was further confirmed in the extinction test
in which this pattern of responding clearly persisted when no outcomes
were presented. Finally, and most importantly, the effect of degrading
the instrumental contingency was similar in both group sham and group
GC, suggesting that the lesioned animals were capable of encoding the
specific action-outcome contingencies.

Figure 6.
Experiment 2. Mean performance of
lever-press and chain-pull actions per minute, averaged over 3 min
periods, during each of the 4 d of contingency assessment
(left four panels) and during the extinction test
(right panel). Test performance is divided into
two panels: A, showing the data from group sham; and
B, showing the data from group GC. In this figure,
performance of each action is presented separately in each
panel according to whether the action-outcome
contingency has been degraded, i.e., the outcome delivered by
performing the action is the same as the one now delivered without
performing the action (Same, filled
circles) or has not been degraded, i.e., the outcome delivered
by performing the action differs from that delivered without performing
the action (Diff, open circles). In the
panel illustrating the extinction test, the previously
degraded action-outcome contingency remains designated as
Same and the nondegraded as Diff,
although no outcomes of any kind were presented in this test.
This description of the data was confirmed by the statistical
analysis. A four-way mixed ANOVA was conducted on the data from the
four sessions of contingency assessment with a between-subjects factor
of group and within-subjects factors of contingency, separating performance of the same and different actions, session, and 3 min
periods in each session. This analysis revealed a main effect of
contingency (F(1,14) = 11.90) but
neither a main effect of group nor a group × contingency
interaction (F < 1). Furthermore, simple main effects
analyses revealed a significant effect of contingency in both group
sham (F(1,14) = 6.61) and group
GC (F(1,14) = 5.34). In addition,
there was a main effect of session (F(3,42) = 35.29) and an interaction
between session and contingency (F(3,42) = 2.99), indicating that the
effect of contingency developed over sessions. There was also an effect
of period (F(9,126) = 8.22) and a
session × period interaction
(F(9,126) = 3.66), demonstrating that
overall performance declined within a session, with this effect being
more evident in earlier than in later sessions. None of the other
higher order interactions were significant (maximum F = 1.25).
A two-way analysis of the data from the extinction test was conducted
using factors of group and of contingency. This analysis revealed a
main effect of contingency (F(1,14) = 27.41) but not an effect of group
(F(1,14) = 1.44) or a significant
group × contingency interaction (F < 1). Simple
main effects analysis again revealed a significant effect of
contingency in both group sham
(F(1,14) = 14.10) and group GC
(F(1,14) = 13.34).
Overall, these data are consistent with the suggestion that GC-lesioned
animals are able to encode the action-outcome contingencies to which
they are exposed and in a manner that does not differ from sham
animals. No evidence was found in experiment 2 to support the
contention that the deficit in outcome devaluation observed in
experiment 1 in GC-lesioned rats was caused by an inability to encode
the action-outcome contingency. The pattern of results found in
experiments 1 and 2 is consistent, therefore, with the conclusion that
GC-lesioned animals are able to detect and discriminate different
actions, different outcomes, and different contingencies between
actions and outcomes. Furthermore, whereas they are able to detect
differences in the incentive value of taste features of instrumental
outcomes when these are presented, it appears that they are unable to
recall these relative incentive values and so modify performance of
their actions in the absence of the outcomes themselves. As a
consequence, these results accord with the suggestion that the insular
cortex forms a part of an incentive memory system that encodes the
incentive value assigned to food outcomes in instrumental conditioning.
Experiment 3
The results of experiments 1 and 2 clearly indicate that the GC
mediates the encoding of changes in the incentive value of the
instrumental outcome but is not centrally involved in instrumental learning per se. In experiment 3, we attempted to find additional evidence for this suggestion by exploring the effect of lesions of the
GC on another manipulation of incentive value: that induced by a
post-training shift in food deprivation.
Considerable recent evidence suggests that, rather than being directly
sensitive to changes in motivational state, the rats' instrumental
performance can be controlled primarily by the way motivational states
affect the incentive value of the instrumental outcome (for review, see
Balleine, 2000 ). Thus, for example, in rats a reduction in food
deprivation often only influences the performance of actions that gain
access to a specific nutritive outcome if they have had the opportunity
to consume the outcome after the shift, thereby allowing an evaluation
of the effect of the new motivational state on the incentive properties
of the outcome. We have referred to this as incentive learning
(Balleine, 1992 ; Dickinson and Balleine, 1994 ).
In one study, for example, Balleine and Dickinson (1994) trained
food-deprived rats to perform two actions, lever pressing and chain
pulling, on a concurrent schedule with one action earning access to
food pellets and the other to a maltodextrin solution. After training,
the rats were given a number of reexposure sessions in which they were
allowed to consume one of the instrumental outcomes in the training
state, i.e., food deprived, and the other after a period of free
feeding, i.e., in an undeprived state. The rats were then given a
single choice test between the chain and the lever, conducted in
extinction while they were undeprived. The animals responded
significantly less on the action that, in training,
had delivered the outcome to which they were subsequently reexposed in
the undeprived state. Balleine and Dickinson (1994) interpreted this
result as indicating that, during reexposure, the animals assigned a
low incentive value to the outcome reexposed in the undeprived state, a
differential evaluation that was then manifest in subsequent
instrumental performance during the test. Consequently, this experiment
suggests that, in instrumental conditioning, animals learn about the
consequences of their actions and that motivational control of
performance is then a matter of the way in which changes in deprivation
act to modify the incentive value of those consequences.
From this incentive learning perspective, evidence that the GC is
involved in encoding changes in the value of the taste features of the
instrumental outcome suggests that this structure may encode changes in
incentive value induced by a post-training change in primary
motivation, such as that induced by a reduction in food deprivation. As
a consequence, in experiment 3, we examined the effect of lesions of
the GC on the sensitivity of the instrumental performance of rats to
the effects of a post-training shift in primary motivation. Sham and
lesioned rats were food deprived and trained to lever press and to
chain pull, with one action delivering the food pellets and the other
the maltodextrin solution. After this training, an incentive learning
phase was conducted during which all rats were alternated between high
(22.5 hr) and low (0 hr) levels of food deprivation for 6 d. Half
of the rats in each group were exposed to the food pellets when in the
high deprivation state and maltodextrin when in the low deprivation state; the remaining animals received the opposite assignment of
outcomes to deprivation levels. After the incentive learning phase, rats were
maintained in the low deprivation state and were then given a choice
extinction test on the levers and chains.
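The counterbalancing in the incentive learning phase can be laid out explicitly. The following is a minimal sketch under stated assumptions, not the authors' protocol: the text does not say which deprivation state came first or exactly how exposures were distributed across days, so the starting state and the one-outcome-per-day layout are hypothetical, as are all function names.

```python
def deprivation_sequence(n_days=6, first="high"):
    """Alternate high and low deprivation states across days.

    The starting state is an assumption; the source does not specify it.
    """
    order = ("high", "low") if first == "high" else ("low", "high")
    return [order[day % 2] for day in range(n_days)]


# The two counterbalanced assignments of outcome to deprivation state.
ASSIGNMENTS = (
    {"high": "pellets", "low": "maltodextrin"},
    {"high": "maltodextrin", "low": "pellets"},
)


def reexposure_plan(assignment, n_days=6, first="high"):
    """List of (state, outcome) pairs, one per day of the phase."""
    return [(s, assignment[s]) for s in deprivation_sequence(n_days, first)]


def devalued_outcome(assignment):
    """The outcome consumed in the low deprivation state."""
    return assignment["low"]


for assignment in ASSIGNMENTS:
    print(devalued_outcome(assignment), reexposure_plan(assignment))
```

Whichever outcome a rat consumed in the low deprivation state is the one treated as devalued in the later choice test, so the label depends on the rat's assignment rather than on the outcome itself.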
In the choice extinction test, we anticipated that, in accord with
previous studies (Balleine, 1992 ; Balleine and Dickinson, 1994 ),
animals in the sham group would tend to assign a lower incentive value
to the outcome to which they were reexposed when in the low deprivation
state and so, when again in that state on test, would perform fewer of
the actions that, in training, delivered this devalued outcome. If the
GC mediates the encoding of changes in incentive value induced by
shifts in motivational state, then, in the extinction test, rats with
lesions of this area should be unable to recall the changes in
incentive value induced by a shift to the low deprivation state and so
should not show any differential performance of the two actions in the test.
This result would, however, be equally consistent with an impairment in
sensitivity to the effects of a shift in food deprivation in the
GC-lesioned animals. To assess this possibility, we conducted a final
phase in which all of the rats were given four reacquisition sessions
on the lever-press and chain-pull actions with performance delivering
the outcomes assigned in training. Deprivation was alternated between
the high and low state across these sessions.
Histology
The lesions in group GC were, generally, similar to those
described in experiment 1 (Fig. 7).
Again, extensive bilateral cell loss and gliosis were observed
throughout the AP extent of the GC centered on the dorsal agranular
area and extending into the granular area ventral to somatosensory
cortex. Morphological changes were also observed to be similar to those
described in experiment 1. Once again, some asymmetry in the lateral
extent of the lesion was noted in three animals but, again, there was
considerable overlap with the location of the contralateral lesion in
each of these rats as well as with the extent of the lesion observed in
the other rats. Because similar variability did not appear to mitigate
the effects of the lesion in experiment 1, the data from all of the
animals in group GC and group sham that completed instrumental training
were analyzed (i.e., n = 8 per group).

Figure 7.
Experiment 3. Diagrams of coronal sections
(0.48-1.7 mm anterior to bregma) on which the extent of cell loss
observed after bilateral infusions of quinolinic acid aimed at the
gustatory region of the insular cortex has been reconstructed from
histology to reveal the largest (darker) and smallest
(lighter) regions of damage induced in group GC.
Abbreviations are as for Figure 1.
Extinction test
The results of the extinction test are presented in Figure
8 separately for the action that, in
training, delivered the outcome subsequently reexposed in the low
deprivation state (i.e., devalued) and for the other action trained
with the outcome reexposed in the high deprivation state (i.e.,
valued). The performance on the valued and devalued actions is
presented separately for group sham (left panel) and
group GC (right panel). As indicated by Figure 8, in
the extinction test, group sham performed the devalued action less
frequently than the valued action, thereby replicating the incentive
learning effect observed in previous studies (Balleine, 1992 ; Balleine
and Dickinson, 1994 ). The same pattern did not emerge in group GC and,
in this group, there was no difference in the performance of the two
actions at any point during the extinction test.

Figure 8.
Experiment 3. Performance of lever-press and
chain-pull actions averaged across 4 min periods in the choice
extinction test conducted after a post-training reduction in food
deprivation. Before this test, all of the rats were allowed to learn
about the effect of the shift in food deprivation on the incentive
value of one of the two food outcomes used in training by giving them
brief consummatory contact with that outcome in the new, i.e., low
deprivation, state. The effect of this treatment on choice performance
in extinction is presented for group sham (left) and
group GC (right) with the performance of the action
that, in training, delivered the outcome re-exposed in the low
deprivation state (i.e., Deval, filled
circles) plotted separately from performance of the other
action (i.e., Val, open circles) in each
group.
This description was confirmed by the statistical analysis. A three-way
mixed ANOVA was conducted with a between-subjects factor of group and
within-subjects factors of devaluation, separating performance of the
devalued action from that of the valued action, and of period,
separating performance into 2 min periods. This analysis revealed no
main effect of group (F(1,14) = 1.02)
or of devaluation (F(1,14) = 1.39)
but, most importantly, did reveal a significant group × devaluation interaction (F(1,14) = 4.76). Simple main effects analysis conducted on this
significant interaction revealed that, whereas performance on the
valued action did not differ between groups (F < 1),
performance did differ on the devalued action, with the animals in
group GC performing this action at a significantly higher rate than
those in group sham
(F(1,24) = 4.31). In addition,
whereas a significant devaluation effect emerged in group sham
(F(1,14) = 5.64), no such effect was
found in group GC (F < 1). Finally, the overall
analysis revealed an effect of period
(F(9,126) = 3.86) but no other
reliable effects.
These effects of lesions of the GC on outcome devaluation induced by a
post-training shift in primary motivation occurred on test and were not
present in the training data. Performance of the two groups was very
similar, as was their performance on the devalued and valued actions
during the final training session, and a comparable analysis of these
data revealed no main effect of group or of devaluation or any
interaction between these factors (F < 1). The mean
rates of responding (actions per minute) on the manipulanda that, in training,
delivered the to-be-devalued and to-be-valued outcomes, respectively,
were as follows: group sham, 23.7 and 22.6; and group GC, 21.3 and
22.4.
Reacquisition tests
The reacquisition tests compared the effect of up and down shifts
in food deprivation on rewarded instrumental performance in sham and
lesioned rats. The results of these tests are presented in Figure
9 collapsed across sessions and session
order. From this figure, it is clear that both group sham and group GC
were sensitive to the shifts in food deprivation used in this
experiment. Thus, the actions were performed at a higher rate when they
were in a high deprivation state than when in a low deprivation state. More critically, this figure shows that the sensitivity of group GC to
these shifts in deprivation did not differ from that displayed by group
sham, indicating that the deficit in incentive learning was not
produced by general insensitivity to the effects of shifts in food
deprivation on the incentive value of foods. Finally, the performance
of the actions designated "valued" and "devalued" in the
extinction test (Fig. 9, Val, Dev) was not
differentially affected by the shifts in deprivation in either group.
Instead, it appears that the performance of both actions in group sham and group GC was strongly determined by deprivation state.

Figure 9.
Experiment 3. Rate of performance of the
lever-press and chain-pull actions in the reacquisition tests conducted
after the extinction test. Rats were shifted between high and low
levels of deprivation with the effect of these shifts assessed on the
relative performance of the devalued (Dev) and valued
(Val) actions, as designated in the extinction
test. The rate of performance of each action was averaged
across the two sessions conducted in each deprivation state and plotted
separately for group sham (left) and group GC
(right).
To assess this description of the data, a three-way ANOVA was conducted
using a between-subjects factor of group and within-subjects factors of
deprivation state (high vs low) and devaluation (valued vs devalued
action from the extinction test). This analysis revealed a significant
main effect of deprivation state
(F(1,14) = 67.6) but no reliable main
effect of group or devaluation (largest
F(1,14) = 1.22). Furthermore, none of
the interactions involving these factors approached significance
(largest F(1,14) = 2.1).
As predicted by the taste memory account of GC function, lesions of the
GC rendered instrumental performance relatively insensitive to the
effects of outcome devaluation by a post-training reduction in food
deprivation but only when assessed in an extinction test. GC-lesioned
rats could clearly detect the changes in incentive value induced by
these shifts in deprivation with performance in the reacquisition tests
affected by deprivation state to a similar degree to that observed in
the sham rats. Therefore, no evidence was found to suggest that the
deficit in choice performance found in the extinction test was produced
by a deficit in the processes that detect changes in incentive value
induced by a shift in deprivation. Rather, the results of experiment 3 suggest that the lesioned rats fail to remember at the time of
extinction testing what they learned during the incentive learning
stage about the change in incentive value induced by shifts in
motivational state. Specifically, what they fail to remember is the low
incentive value assigned to the outcome reexposed under the low
deprivational state, and consequently they continue to perform the
action trained with this outcome at a level appropriate to a high
incentive value.
The hypothesis that GC lesions cause a deficit in encoding changes in
incentive value can be further refined based on evidence, described in
the introductory remarks to this experiment, that food deprivation
changes the incentive value of food outcomes by modifying the
palatability of their taste features (cf. Balleine, 2000 ). This
evidence suggests that the deficit in incentive learning induced by
lesions of the GC may be specific to encoding changes in taste
palatability. This view suggests that, whereas the GC is not critical
for detecting either tastes or the effect of changes in deprivation on
taste palatability, it is involved in encoding these events in memory
during consummatory contact with the instrumental outcome. As a
consequence, incentive learning is strongly attenuated in GC-lesioned
rats because the lesion renders them unable to encode and so remember
changes in the incentive value of foods induced by a shift in
deprivation when forced to do so in the choice extinction test.
 |
DISCUSSION |
The aim of this series of experiments was to explore the role of
the gustatory region of the insular cortex, the so-called gustatory
cortex, in instrumental conditioning and, more specifically, instrumental outcome devaluation effects. No evidence was found in
these experiments to suggest that this region plays a central role in
the acquisition of instrumental conditioning or in the processes that
detect and encode the instrumental action-outcome contingency.
Nevertheless, these experiments do provide clear evidence that this
region is involved in encoding changes in the incentive value of the
instrumental outcome. More specifically, the results suggest that the
gustatory cortex plays a critical role in "incentive memory,"
allowing animals to encode changes in incentive value based on changes
in the palatability of taste features of the instrumental outcome
detected during consummatory contact.
This conclusion is based on a number of findings. In experiment 1, lesions of the GC had no effect on the rate of lever-press or
chain-pull actions when these were acquired on an FI 20 sec schedule of
reinforcement, nor was evidence found that performance on the RR
schedules was affected by the lesion as the response requirement was
increased. Nevertheless, when one of the instrumental outcomes used
during this training was devalued using a specific satiety treatment, a
very potent effect of the GC lesion was found on instrumental
performance in the subsequent choice extinction test. Sham animals
could clearly recall which outcome had been devalued and could
integrate that knowledge with the action-outcome relationships encoded
during training because, on test, they performed the action that had
previously delivered the now-devalued outcome less often. In contrast,
GC-lesioned rats showed no effect of devaluation and performed both
actions at a similar rate, one that was comparable with that of the
valued action in group sham.
Importantly, this effect of the lesion only emerged when the test was
conducted in extinction. When the effect of specific satiety-induced
outcome devaluation was assessed in a reward test, i.e., a test
conducted in the same manner as the extinction test except the outcomes
were now delivered as a consequence of instrumental performance, no
effect of the lesion was detected. Thus, although GC-lesioned rats
showed no devaluation effect in extinction, in the reward test they
reduced performance of the action that delivered the devalued outcome
relative to the other action in a manner that was indistinguishable
from the sham group. This pattern of results is consistent with the
view that GC lesions affect the rats' ability to recall the effects of
devaluation treatments on outcome value without affecting their ability
to detect these effects, a suggestion bolstered by the results of
experiment 3. In that experiment, we used a different procedure to
devalue the instrumental outcome, i.e., devaluation by a post-training
shift in primary motivation. Rats were trained while food deprived to lever press and chain pull for different food outcomes as in experiment 1. After this training, the degree of food deprivation was reduced by
providing access to their maintenance diet ad libitum, and the effect of this shift in deprivation on performance was then assessed in an extinction test. Before the test, however, and on the
basis of our previous behavioral work, we first provided the rats with
an opportunity for incentive learning by giving them several sessions
in which they could consume small quantities of one instrumental
outcome in the low deprivation state. This treatment has been reported
to devalue selectively the outcome exposed in the undeprived state
(Balleine and Dickinson, 1994; Balleine, 2000) and, in line with these
reports, rats in group sham performed the action that, in training,
delivered the outcome exposed in the low deprivation state less often
in extinction than the other action. In contrast, the
GC-lesioned rats did not show a selective devaluation effect and
performed both actions at a similar rate. As in experiment 1, this
result was confined to extinction; the reacquisition tests showed
that the GC-lesioned rats were as able as shams to detect
the effect of shifts in food deprivation. Thus, in accord with
experiment 1, the results of experiment 3 suggest that lesions of the
GC impair the ability of rats to recall changes in incentive value but
do not appear to affect their ability to detect those changes.
Although this interpretation is consistent with the results of
experiments 1 and 3, it remains a possibility that, rather than
mediating the memory for changes in
outcome value, the GC is critical for encoding the instrumental
action-outcome relationship during training. This account predicts
that, after outcome devaluation, GC lesions will produce a deficit in
choice performance, not because of a failure to encode or recall the
reduced incentive value of an outcome but because of a failure to
encode the causal relationship between an action and its specific
consequences. We assessed this account in experiment 2 and found clear
evidence against it. In this experiment, one action-outcome contingency
was degraded by making the delivery of the outcome after the action and
delivery of the outcome in the absence of the action equally probable. If the GC lesion renders rats unable to encode the specific
action-outcome relationships to which they were exposed in training,
the lesioned rats should have been relatively insensitive to
noncontingent outcome delivery. Nevertheless, no evidence emerged to
suggest that the lesioned rats differed from the shams in their ability to encode the instrumental action-outcome relationship. Both groups appeared to be similarly sensitive to degradation of the instrumental contingency.
The role of the GC in incentive learning
Together, the results of the current experiments suggest that, in
instrumental conditioning, the GC is involved in recalling changes in
incentive value induced by both specific satiety and a shift in primary
motivation, changes that evidence suggests are mediated by incentive
learning (Balleine, 1992; Balleine and Dickinson, 1998). Thus, it seems
reasonable to propose that the GC is involved, generally, in the
incentive learning process, i.e., a learning process through which
changes in incentive value are encoded during direct consummatory
experience. Nevertheless, this involvement is likely to be limited to
situations in which changes in incentive value are based predominantly
on changes in the evaluation of taste features of the instrumental outcome.
Balleine and Dickinson (1998) demonstrated that outcome devaluation by
specific satiety can be mediated solely by the taste of a food outcome,
which suggests that, in this situation, incentive value is assigned
predominantly on the basis of a representation of the outcome's taste
features. If, therefore, an animal is unable to encode these
taste features as an aspect of the representation of the instrumental
outcome stored in memory, then the incentive value assigned to these
features should also fail to be encoded in memory, and an animal
required to establish the current incentive value of an outcome by
interrogating its encoded outcome representation should have difficulty
in doing so. This account of the results of the current experiments
suggests that the failure to find an effect of outcome devaluation
treatments in the GC-lesioned rats was not produced by either a
reduction in sensitivity to devaluation or interference with
instrumental learning processes. Rather, it is proposed that
GC-lesioned animals are unable to encode the taste features of foods or
fluids within their mnemonic representation of the instrumental
outcome. The rats appear, therefore, to be insensitive to devaluation
treatments that act by changing the palatability of taste features but
only in situations that require an animal to retrieve the current value
of an outcome from memory to formulate a course of action.
The hypothesis proposed based on this analysis is, therefore, that the
gustatory region of the insular cortex operates to encode the taste
features of the instrumental outcome as an aspect of the representation
of that outcome in memory. On this account, the GC mediates
the encoding of one of the several sensory features (e.g., odor,
texture, temperature, and visual properties) on which representation of
the instrumental outcome and the assignment of incentive value could be
based, albeit the most salient of these elements with respect to the
assignment of incentive value to nutritive outcomes. As such, it seems
reasonable to propose that the GC acts as part of a distributed memory
system involving closely related cortical areas, such as somatosensory,
visual, and olfactory regions, damage to all of which appears to
generate specific sensory agnosias (Meyer, 1984; Braun, 1989), and which
together may provide the basis for the formation of a rich sensory
representation of specific instrumental outcomes.
Finally, considerable evidence, too much to review in detail here (cf.
Dickinson and Balleine, 1994; Balleine, 2000), has accumulated to
support the view that incentive learning modifies the incentive value
assigned to food and fluid outcomes by allowing animals to learn about
a change in the affective response elicited by consummatory contact
with those outcomes. The fact that, in the current studies, rats with
lesions of the GC were capable of detecting changes in outcome value
suggests that the GC is not critically involved in the presentation of
affect. The evidence that GC-lesioned rats were unable to recall
changes in value does, however, suggest that, after outcome
devaluation, the GC functions to allow the changes in affective
response elicited by foods and fluids to be encoded with the taste
features of those events. Thus, although changes in incentive value
based on changes in the palatability of taste features induced by
devaluation treatments are not detected by the GC, it appears likely
that the GC is involved in encoding those taste features in memory. As
such, encoding of the incentive value of the outcome in memory must
require, at the v |