Abstract
An abundant literature has highlighted the importance of the nucleus accumbens core (NAcC) in behavioral tasks dependent on external stimuli. Yet, some studies have also reported the absence of involvement of the NAcC in stimulus processing. We aimed to compare, in male rats, the underlying neuronal determinants of incentive and instructive stimuli in the same task. We developed a variant of a GO/NOGO task that reveals important differences in these two types of stimuli. The incentive stimulus invites the rat to engage in the task sequence. Once the rat has decided to initiate a trial, it remains engaged in the task until the end of the trial. This task revealed the differential contribution of the NAcC to responding to different types of stimuli: responding to the incentive stimulus depended on NAcC AMPA/NMDA and dopamine D1 receptors, but the retrieval of the response associated with the instructive stimuli (lever pressing on GO, withholding on NOGO) did not. Our electrophysiological study showed that more NAcC neurons responded, and responded more strongly, to the incentive than to the instructive stimuli. Furthermore, when animals did not respond to the incentive stimulus, the induced excitation was suppressed for most projection neurons, whereas interneurons were strongly activated at a latency preceding that found in projection neurons. This work provides insight into the underlying neuronal processes explaining the preferential implication of the NAcC in deciding whether and when to engage in reward-seeking rather than in deciding which action to perform.
SIGNIFICANCE STATEMENT The nucleus accumbens core (NAcC) is essential for processing information carried by reward-predicting stimuli. Yet, stimuli have distinct properties: incentive stimuli orient the attention toward reward-seeking, whereas instructive stimuli inform about the action to perform. Our study shows that, in male rats, NAcC perturbation with glutamate or dopamine antagonists impeded responses to the incentive but not to the instructive stimulus. NAcC neuronal recordings revealed a stronger representation of incentive than instructive stimuli. Furthermore, we found that interneurons are recruited when rats fail to respond to incentive stimuli. This work provides insight into the underlying neuronal processes explaining the preferential implication of the NAcC in deciding whether and when to engage in reward-seeking rather than in deciding which action to perform.
Introduction
The nucleus accumbens core (NAcC) is widely recognized as central to incentivizing actions prompted by reward-predictive stimuli (Cardinal et al., 2002; Nicola, 2007, 2016; Yin et al., 2008; Floresco, 2015). NAcC lesion, inactivation or perturbation of dopamine transmission all reduce behavioral responding to incentive stimuli in Pavlovian (Di Ciano et al., 2001; Saunders and Robinson, 2012; Clark et al., 2013; Fraser and Janak, 2017), general Pavlovian-to-instrumental (Corbit et al., 2001; Corbit and Balleine, 2011), and discriminative stimulus tasks (Yun et al., 2004; Ambroggi et al., 2008, 2011; Nicola, 2010).
By analogy with the dorsal striatum, many authors consider that the principal function of the NAcC is to form associations between the stimuli and the outcomes they predict to select actions leading to the most valuable outcome (Nicola, 2007; Yin et al., 2008; Humphries and Prescott, 2010; van der Meer and Redish, 2010; Khamassi and Humphries, 2012; Schultz, 2016). This hypothesis is supported by electrophysiological data showing that NAcC neurons encode the reward-predictive value of stimuli (Setlow et al., 2003; Nicola et al., 2004b; Ambroggi et al., 2008; Roesch et al., 2009; Goldstein et al., 2012; Nakamura et al., 2012; Bissonette et al., 2013; Strait et al., 2015; Sleezer et al., 2016; Morrison et al., 2017; Yoo et al., 2018). Hence, within this action selection framework, the reduced performance induced by NAcC perturbation is interpreted as a failure to retrieve from memory the motor program associated with the reward-predictive stimulus.
However, other pharmacological or lesional studies showed that the NAcC or its dopamine input are not always necessary for animals to discriminate between stimuli predicting differently valued outcomes or instructing different actions to perform (Amalric and Koob, 1987; Cole and Robbins, 1989; Robbins et al., 1990; Hauber et al., 2000; Giertler et al., 2004; Floresco et al., 2006; Calaminus and Hauber, 2007; Castañé et al., 2010; Ghods-Sharifi and Floresco, 2010). Hence the NAcC may not be involved in assessing the value of the predicted outcome but rather in mediating the incentive motivation to engage in reward-seeking in response to stimuli.
These seemingly contrasting data emphasize the importance of the type of stimulus used. Indeed, in NAcC-dependent tasks, the stimuli prompt rats to initiate reward-seeking actions. In this situation, stimuli reorient the attention of the animal and cause a switch from spontaneous behaviors (e.g., grooming, sniffing) to task performance (Nicola, 2010). Such stimuli are said to have incentive properties (Berridge, 2004, 2012). In contrast, in NAcC-independent tasks, rats have to engage in actions to trigger stimulus presentations. Here, stimuli are temporally expected and do not elicit action initiation; they are used as instructions and only provide information regarding the upcoming outcome or the action to perform.
Although numerous studies have recorded NAcC neuronal responses to different types of stimuli, there is no study that explicitly compared incentive and instructive stimuli responses of NAcC neurons. Here, in a GO/NOGO task that sequentially uses these two types of stimuli, we show that incentive stimuli (INC) activate more NAcC neurons than instructive stimuli and that behavioral responding to the former but not the latter is dependent on glutamate and dopamine transmission. Further, we show that the absence of responding to incentive stimuli engages a population of NAcC interneurons that responds to stimuli before projection neurons, suggesting that interneurons oppose the invigorating effect of dopamine and glutamate afferent signals.
Materials and Methods
Animals.
The subjects were male Long–Evans rats (Harlan Sprague Dawley) weighing ∼300 g on arrival and individually housed on a 12 h light/dark cycle. Experiments were conducted during the dark phase. Throughout all experiments, food restriction (beginning 1 week before training) was adjusted daily at the end of experimental manipulations to maintain the rats at ∼90% of their initial body weight. The experiments were performed in accordance with the guidelines on animal care and use of the National Institutes of Health of the United States, European guidelines (European Community Council Directive, 2010/63/UE), and National guidelines.
GO/NOGO task.
This study was conducted in operant chambers (23 × 30 cm) containing two retractable levers, a reward receptacle located between them on one wall of the chamber, two house lights, a blue light located in the reward receptacle, a white noise speaker and a tone speaker (Med Associates).
Rats were run daily on the GO/NOGO task for 2 h. Two stimuli instructed the rat to either respond or withhold a response to receive a liquid sucrose reward (10%, 75 μl, paired with a 100 ms illumination of the reward receptacle light) delivered into a well of the reward receptacle.
The intertrial interval was 15 s. A trial started with presentation of the incentive stimulus. This consisted of extension of the left lever (hereinafter called the initiation lever) for up to 10 s. A press on the initiation lever triggered the presentation, after a variable delay (0.8–2 s, 1.3 s on average), of one of two instructive stimuli consisting of 1 s tones (an intermittent tone at 4 kHz and a siren tone ramped from 4 to 8 kHz with a 400 ms period, presented randomly with a probability of 0.5). We refer to these instructive stimuli as GO and NOGO stimuli. Upon termination of these stimuli, the right lever (hereinafter called the response lever) extended for 1.5 s. Rats were rewarded if they pressed or did not press the response lever after GO and NOGO stimuli, respectively. Importantly, on correct trials, the reward was delivered 3 s after the onsets of both GO and NOGO stimuli. Pressing the response lever resulted in the retraction of the lever on GO trials but did not affect the time of reward delivery. This feature is essential to prevent any reward discounting effects between GO and NOGO trials. If the rat did not press the response lever on GO trials, it retracted after 1.5 s and the intertrial interval was reinitiated. Lever pressing on the response lever on NOGO trials had no consequence on reward delivery or on the retraction of the lever. The initiation lever was always located to the left and the response lever to the right of the receptacle, but GO and NOGO stimuli (intermittent and siren tones) were randomly assigned and remained constant for individual rats throughout training and experiments.
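The contingencies described above can be summarized in a short sketch. This is an illustrative Python model only, not the control code run on the operant chambers; the names `score_trial` and `TrialOutcome` are hypothetical, and the only values taken from the task description are the 10 s initiation window, the 1 s tone, the 1.5 s response window, and the reward delivered 3 s after tone onset.

```python
from dataclasses import dataclass
from typing import Optional

INIT_WINDOW_S = 10.0      # initiation lever available for up to 10 s
TONE_DURATION_S = 1.0     # GO/NOGO tone lasts 1 s
RESPONSE_WINDOW_S = 1.5   # response lever extended for 1.5 s after tone offset
REWARD_DELAY_S = 3.0      # reward 3 s after tone onset on correct trials


@dataclass
class TrialOutcome:
    initiated: bool
    correct: Optional[bool]          # None when the trial was never initiated
    reward_time_s: Optional[float]   # relative to tone onset; None if no reward


def score_trial(init_latency_s: Optional[float],
                stimulus: str,
                press_latency_s: Optional[float]) -> TrialOutcome:
    """Classify one trial (hypothetical helper).

    init_latency_s: time from initiation-lever extension to press, or None.
    stimulus: "GO" or "NOGO".
    press_latency_s: time from tone onset to response-lever press, or None.
    """
    if stimulus not in ("GO", "NOGO"):
        raise ValueError("stimulus must be 'GO' or 'NOGO'")
    # No press within the initiation window: the trial never starts.
    if init_latency_s is None or init_latency_s > INIT_WINDOW_S:
        return TrialOutcome(False, None, None)
    # The response lever is only available after the tone ends.
    lever_on = TONE_DURATION_S
    lever_off = TONE_DURATION_S + RESPONSE_WINDOW_S
    pressed = (press_latency_s is not None
               and lever_on <= press_latency_s <= lever_off)
    correct = pressed if stimulus == "GO" else not pressed
    # Reward time is fixed relative to tone onset for both trial types.
    return TrialOutcome(True, correct, REWARD_DELAY_S if correct else None)
```

Because the reward time does not depend on the press latency, a correct GO and a correct NOGO trial return the same `reward_time_s`, which is the property the task relies on to equate reward delay across the two instructive stimuli.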
Behavioral training.
Training was conducted in multiple steps. First, rats were trained to respond to the initiation lever: the lever was presented for 10 s every 15 s and every lever-press triggered sucrose delivery. Once rats attained a stable level of responding (i.e., >80% responses, 4–9 d), rats moved to the next step where the GO stimulus was introduced. At this stage, the task was similar to the final task but only GO stimuli were presented. Once rats responded reliably to the GO stimulus (>80% responses, 6–15 d), the NOGO stimulus was introduced. During this step, errors (i.e., not responding on GO or responding on NOGO trials) triggered the repetition of the same instructive stimulus on the next trial to prevent perseverative behaviors. Once rats responded accurately to GO and NOGO cues (>80% correct responses, 17–32 d), the repetition of error trials was no longer used and rats underwent surgery for electrode or cannula implantation. Electrophysiological and pharmacological experiments were conducted after 4–6 d of retraining without repeating error trials.
Surgeries.
For bilateral electrophysiological recordings in the NAcC, two 8-electrode arrays (NB Labs; 50 μm stainless steel wires arranged in 2 rows of 4) were attached to a microdrive device that allowed the entire arrays to be lowered by 80 μm increments. Target coordinates of the medioposterior electrode were as follows: AP: +1.2, ML: ±2.0, and DV: −6.5 to −8.5 mm. For the pharmacological study, rats were bilaterally implanted with microinjection guide cannulae (27 gauge, Plastics One) in the NAcC (AP: +1.5, ML: ±2, and DV: −6 mm).
Animals were anesthetized with isoflurane (5%) and placed in a stereotaxic apparatus. Anesthesia was maintained with isoflurane (0.5–2.0%) during surgery. Microdrives or guide cannulae were secured to the skull with bone screws and dental acrylic. Rats were given at least 7 d of recovery before being retrained on the task and habituated to the handling procedures.
Microinjections.
After recovering from surgery, two groups of rats were microinjected in the NAcC with the D1 receptor antagonist SCH23390 (0.5 μg/side in 0.5 μl CSF) or a mixture of the AMPA and NMDA antagonists 6-cyano-7-nitroquinoxaline-2,3-dione (CNQX) and 2-amino-5-phosphonopentanoic acid (AP5; 1 and 2 μg, respectively, in 0.5 μl CSF). Each group also received CSF control injections. The obturators were removed and 33-gauge injector cannulae were inserted into the guides. Injectors extended 1.5 mm below the tip of the cannula. A volume of 0.5 μl of CSF, SCH23390, or CNQX/AP5 was injected over 2 min. After a 1 min post-injection period, the injectors were removed, the obturators were replaced, the animal was immediately placed into the behavioral chamber and the behavioral session began. CSF or drugs (SCH23390 or CNQX/AP5) were injected on different days (with at least 1 intervening day with no injection), in random order, in the same animals.
Electrophysiology.
Electrophysiological recordings were conducted as described previously (Nicola et al., 2004b; Ambroggi et al., 2008). Animals were connected to the recording apparatus (Plexon) and run for 2 h daily sessions of the GO/NOGO task. The microdrive carrying the electrode arrays was lowered by 80 or 160 μm at the end of each session to get a new set of neurons every day.
Isolation of individual units was performed off-line with Offline Sorter (Plexon) using principal component analysis. Only units with well-defined waveforms were included in this study. Interspike-interval distributions, cross-correlograms, and autocorrelograms were used to ensure that single units were isolated.
Histology.
Animals were deeply anesthetized with pentobarbital and perfused intracardially with saline and 4% formalin (plus 3% ferrocyanide for rats with electrode arrays). Brains were removed, sectioned (40 μm), and stained for Nissl substance to locate injection or recording sites (labeled by passing a DC current through each electrode before perfusion).
Experimental design and statistical analysis.
The criterion for inclusion in the final analysis was correct electrode or cannula placement in the NAcC. For behavioral analyses, the primary dependent variables were the response probability and latency for the incentive stimulus (INC), GO, and NOGO. All analyses were conducted in MATLAB (MathWorks). For electrophysiological analyses, the primary dependent variables were the mean z-score normalized firing, the onset latency and the response duration. These dependent variables were analyzed with paired or unpaired t tests or ANOVAs. A Bonferroni correction was applied to account for multiple comparisons. Proportions were analyzed using χ2 tests. Distributions were compared using Kolmogorov–Smirnov tests. All results were considered significant at p < 0.05.
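As an illustration of the Bonferroni step, the sketch below multiplies each raw p value by the number of comparisons and caps the result at 1. It is a generic Python version of the correction, not the MATLAB routine used for the analyses; the function names and the example p values in the usage note are hypothetical.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p values (raw p times the number of tests, capped at 1)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def significant(p_values, alpha=0.05):
    """Flag which comparisons remain significant after adjustment."""
    return [p_adj < alpha for p_adj in bonferroni_adjust(p_values)]
```

For three hypothetical post hoc comparisons with raw p values of 0.01, 0.5 and 0.02, only the first stays below 0.05 after adjustment. The cap at 1 means that adjusted values reported as p = 1.00 simply indicate a raw p value of at least 1 divided by the number of comparisons.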
For electrophysiological recordings, peristimulus time histograms (PSTHs) were constructed with 20 ms time bins. Smoothing (LOWESS method, span = 6) was used only for display purposes. PSTHs constructed around the behavioral events were used to detect excitations and inhibitions and the time at which they occurred. The 0.5 s period before the event was used as a baseline period. Excitation and inhibition to each event were determined by the presence of at least 4 bins above the 95% (for excitations) or below the 10% (for inhibitions) confidence interval of the baseline during the analysis window (−250 to 250 ms around the event considered). Onset was determined by the time of the first of four consecutive bins falling outside the confidence interval. The offset was determined analogously, by searching for the first of five consecutive bins within the confidence interval.
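The onset and offset rules can be sketched as follows. This is a minimal Python illustration rather than the analysis code used in the study: the function names are hypothetical, and because the way the confidence bound is derived from the baseline is not detailed here, the upper bound is passed in as a parameter instead of being computed. Only the excitation case is shown; inhibitions would use a lower bound with the comparison reversed.

```python
def psth(trials, t_start, t_stop, bin_s=0.02):
    """Mean firing rate (Hz) per bin.

    `trials` is a list of spike-time lists, in seconds, aligned so that
    the event occurs at t = 0.
    """
    n_bins = int(round((t_stop - t_start) / bin_s))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int((t - t_start) // bin_s)
            if 0 <= idx < n_bins:
                counts[idx] += 1
    return [c / (len(trials) * bin_s) for c in counts]


def first_run(flags, k):
    """Index of the first element of the first run of k consecutive True values."""
    run = 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == k:
            return i - k + 1
    return None


def excitation_onset_offset(rates, upper, start_idx=0, n_on=4, n_off=5):
    """Return (onset_bin, offset_bin) for an excitation, or (None, None).

    Onset: first of n_on consecutive bins above `upper`, searched from start_idx.
    Offset: first of n_off consecutive bins back at or below `upper`.
    The offset is None if the response outlasts the supplied PSTH.
    """
    rel_on = first_run([r > upper for r in rates[start_idx:]], n_on)
    if rel_on is None:
        return None, None
    onset = start_idx + rel_on
    rel_off = first_run([r <= upper for r in rates[onset:]], n_off)
    offset = None if rel_off is None else onset + rel_off
    return onset, offset
```

Multiplying the returned bin indices by the bin width and adding the PSTH start time converts them to latencies relative to the event; the response duration is the difference between offset and onset.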
Color-coded maps and average PSTHs across neurons were constructed with smoothed 20 ms bins. Before averaging, the firing rate of each neuron during each bin was transformed to a z-score using 0.5 s preceding the event as a baseline.
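The normalization amounts to subtracting the baseline mean from each bin and dividing by the baseline standard deviation. The Python sketch below illustrates this under two assumptions that are not stated in the text: the PSTH passed in starts exactly at the beginning of the baseline (so the first 25 bins of 20 ms cover the 0.5 s before the event), and the sample standard deviation is used. The function name is hypothetical.

```python
import statistics


def zscore_psth(rates, n_baseline_bins=25):
    """Z-score a PSTH against its own baseline.

    `rates` must begin at the start of the baseline period; with 20 ms bins,
    25 bins correspond to the 0.5 s preceding the event.
    """
    baseline = rates[:n_baseline_bins]
    mu = statistics.mean(baseline)
    sd = statistics.stdev(baseline)  # sample SD: an assumption of this sketch
    if sd == 0:
        raise ValueError("flat baseline: z-score is undefined")
    return [(r - mu) / sd for r in rates]
```

A neuron that is silent throughout the baseline has zero variance and cannot be normalized this way, which is why the sketch raises an error rather than returning a value.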
Code/data accessibility.
The data and analysis routines used in this study are available upon request.
Results
Behavioral analysis
We used a behavioral task in which we sequentially presented two different types of stimuli to which the animals had to respond to obtain a sucrose reward (Fig. 1A,B). The INC, presented after an intertrial interval of 15 s, corresponded to the extension of the initiation lever (producing a 300 ms sound) and informed the animal it could engage in a trial by pressing the lever within 10 s. This triggered, after a short delay (0.8–1.6 s), the presentation of one of two instructive stimuli with equal probability. These 1 s GO and NOGO auditory stimuli informed the rat to later press or not press, respectively, a second lever (called the response lever) that was presented for 1.5 s immediately after the termination of the instructive stimuli. Upon a correct response, a reward was delivered 3 s after the onset of either instructive stimulus. Hence, the two stimuli instructed opposite instrumental responses, but their reward value was identical, being of the same magnitude and delivered after the same delay.
Behavioral performance in the GO/NOGO task. A, Task diagram showing the sequence of events during a trial. B, Temporal structure of the task. Numbers on top relate to the event sequence presented in A. C, Average correct response probabilities (number of responses/number of stimuli presented) for the three stimuli. For INC stimuli, a response was considered correct if the animal pressed the initiation lever within 10 s. For GO trials, a response was considered correct if there was a lever press during the presentation of the response lever. For NOGO trials, a response was considered correct if there was no lever press during the presentation of the response lever. Gray lines represent individual values. *p < 0.05 (Bonferroni post hoc test). D, Average response latency (time from the initiation lever extension to the initiation lever press and from GO or NOGO onset to the press on the response lever). For NOGO trials, the latency was computed for error trials. Gray lines represent individual values. *p < 0.05 (Bonferroni post hoc test). E, Relative timings of all behavioral events for correct and error trials. All events are aligned to GO and NOGO onsets. Left and right edges of the blue rectangle represent the onset and offset [triggered by the initiation lever press (Initi LP)] of the INC stimulus, respectively. Gray rectangles represent the periods where the response lever is extended. The light brown line represents the times in which the animal had its head in the reward receptacle. The left edge of the yellow rectangles marks the onset of the reward delivery and the right edge, the exit of the reward receptacle. F, Average response probability to the INC stimulus (10 min bins). Gray lines represent individual values. +p < 0.1, *p < 0.05, **p < 0.01 (Bonferroni post hoc tests). G, Average response latency to the INC stimulus (10 min bins). Gray lines represent individual values. +p < 0.1, *p < 0.05 (Bonferroni post hoc tests).
H, Average response probability to the GO stimulus (10 min bins). Gray lines represent individual values. I, Average response latency to the GO stimulus (10 min bins). Gray lines represent individual values.
Over 2 h sessions, rats initiated, on average, 300 ± 5.87 trials, leading to a probability of responding to the INC stimulus of 0.90 ± 0.01 and an average response latency of 1.34 ± 0.07 s (Fig. 1C,D; ANOVA, F(2,159) = 7.23, p = 0.001 and F(2,131) = 97.73, p = 1.06 × 10^−26, respectively). Although this was never required, rats spontaneously and almost systematically (probability 0.98 ± 0.004) entered the reward receptacle shortly after the initiating lever press and before the presentation of the instructive stimuli (0.55 ± 0.01 s after the lever press; Fig. 1E). Furthermore, when rats pressed the initiating lever, they remained in the reward receptacle until they received the feedback from the reward for most of the trials (probability 0.93 ± 0.02). Overall, rats had a slightly lower accuracy for GO than NOGO trials (Bonferroni post hoc test, p < 0.05; Fig. 1C). On GO trials, rats exited the reward receptacle during the presentation of the GO stimulus (0.76 ± 0.03 s), pressed the response lever (1.48 ± 0.02 s after GO onset), and re-entered the receptacle (1.87 ± 0.02 s after GO onset) to collect the reward. On NOGO trials, rats remained in the receptacle until reward collection and consumption.
On GO error trials, rats remained in the receptacle during the stimulus and the response lever extension period as they did on NOGO correct trials (Fig. 1E). Rats exited the receptacle earlier on GO error trials than on NOGO correct trials (1.34 ± 0.08 s after the time the reward should have been delivered). On NOGO error trials, rats displayed a behavior similar to that displayed on GO correct trials. They exited the receptacle slightly earlier than on GO correct trials (GO correct, 0.76 ± 0.03 s, NOGO error, 0.62 ± 0.05 s, unpaired t test, t(72) = 2.49, p = 0.015). However, the latency to press the response lever was similar (GO correct, 1.48 ± 0.02 s, NOGO error, 1.53 ± 0.04 s, unpaired t test, t(78) = −0.99, p = 0.32).
Hence, the behavioral sequence was similar between correct and error trials of the opposite stimulus. When rats made an error, they remained in the receptacle until the reward should have been delivered, suggesting that they use the feedback from reward delivery to detect their errors.
These results indicate that rats treated the incentive and instructive stimuli differently. Upon presentation of the INC stimulus, when rats decided to commit to the task and pressed the initiating lever, they rarely disengaged from the whole behavioral sequence.
Over the session, the likelihood to respond to the INC stimulus decreased and the latency increased (Fig. 1F,G; repeated-measures ANOVA, F(11,176) = 10.071, p = 4.06 × 10^−14 and F(11,132) = 9.326, p = 3.080 × 10^−12, respectively), indicating that motivation to engage in the task was reduced over the long sessions. However, we found that responding to the GO stimulus did not vary over the course of the session (Fig. 1H,I; repeated-measures ANOVA, F(11,143) = 0.979, p = 0.468 and F(11,132) = 1.177, p = 0.309, respectively). Thus, the motivation level specifically impacted the engagement in the task by responding to the INC stimulus but not the subsequent response to the instructive stimulus.
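The within-session time course can be obtained by splitting the 2 h session into twelve 10 min bins and computing, for each bin, the number of responses divided by the number of stimuli presented. The Python sketch below illustrates that computation; the function name and its arguments are hypothetical and it is not the analysis code used for Figure 1.

```python
def binned_response_probability(onsets_s, responded, bin_s=600.0, session_s=7200.0):
    """Response probability per time bin.

    onsets_s: stimulus onset times in seconds from session start.
    responded: one boolean per stimulus (True if the rat responded).
    Returns one value per bin; None for bins with no stimulus.
    """
    n_bins = int(session_s // bin_s)
    shown = [0] * n_bins
    hits = [0] * n_bins
    for t, r in zip(onsets_s, responded):
        idx = int(t // bin_s)
        if 0 <= idx < n_bins:
            shown[idx] += 1
            hits[idx] += int(r)
    return [h / s if s else None for h, s in zip(hits, shown)]
```

Bins without any stimulus are returned as `None` rather than 0 so that they are not mistaken for bins in which the animal never responded.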
NAcC AMPA/NMDA and dopamine D1 receptors are necessary to respond to incentive but not instructive stimuli
We first sought to confirm that the GO/NOGO task captured the implication of the NAcC in the two different types of stimuli. We expected that manipulating NAcC activity would affect responding to the incentive but not to the instructive stimuli.
We locally injected into the NAcC either the selective AMPA and NMDA antagonists CNQX and AP5 or the selective D1 receptor antagonist SCH23390, which have been found to affect behavioral responding to incentive stimuli (Yun et al., 2004; Ambroggi et al., 2008, 2011).
We trained two groups of rats (n = 9 and n = 6) on the GO/NOGO task and implanted guide-cannulae in the NAcC (see Materials and Methods, Histology; Fig. 2A). The first group of rats was injected bilaterally with either CSF or SCH23390 just before being run on the task. Blocking NAcC D1 receptors significantly reduced the response probability to the INC stimulus (repeated-measures ANOVA, stimulus effect: F(2,12) = 6.38, p = 0.013; drug effect: F(1,6) = 3.768, p = 0.100; interaction: F(2,12) = 6.534, p = 0.012; Bonferroni post hoc test, INC stimulus CSF vs SCH23390, p = 0.04) but had no effect on either the GO or the NOGO stimulus (Bonferroni post hoc tests, p = 1.00; Fig. 2B). The analysis of response latencies revealed no interaction effect but showed that SCH23390 induced an overall increase in latencies (repeated-measures ANOVA, stimulus effect: F(2,8) = 4.123, p = 0.059; drug effect: F(1,4) = 8.445, p = 0.044; interaction: F(2,8) = 1.44, p = 0.292; Fig. 2C). The second group of rats was injected bilaterally with either CSF or a mixture of CNQX/AP5. We found that blocking glutamate transmission selectively affected the likelihood of responding to the INC stimulus (repeated-measures ANOVA, stimulus effect: F(2,10) = 2.344, p = 0.146; drug effect: F(1,5) = 16.611, p = 0.010; interaction: F(2,10) = 5.889, p = 0.020; Bonferroni post hoc test CSF vs CNQX/AP5, INC, p = 0.003, GO, p = 1.00, NOGO, p = 1.00; Fig. 2D) without significantly affecting response latency (repeated-measures ANOVA, stimulus effect: F(2,8) = 7.835, p = 0.013; drug effect: F(1,4) = 3.34, p = 0.142; interaction: F(2,8) = 1.779, p = 0.23; Fig. 2E).
Effect of glutamate and dopamine D1 receptor antagonists on performance in the GO/NOGO task. A, Histological reconstruction of cannulae placements shown on coronal sections. SCH23390 group, purple; CNQX/AP5 group, brown. B, Effect of SCH23390 on average correct response probabilities for INC, GO, and NOGO stimuli. Gray lines represent individual animals. *p < 0.05 (Bonferroni post hoc test). C, Effect of SCH23390 on average response latencies. For NOGO, latencies were computed on error trials. Gray lines represent individual animals. D, Effect of CNQX/AP5 (CNAP) on average correct response probabilities for INC, GO, and NOGO stimuli. Gray lines represent individual animals. **p < 0.01 (Bonferroni post hoc test). E, Effect of CNQX/AP5 on average response latencies. For NOGO, latencies were computed on error trials. Gray lines represent individual animals.
Thus, blocking NAcC AMPA/NMDA or D1 receptors reduced the engagement in the task in response to the INC stimulus but did not alter the ability of rats to decide which action to perform in response to the instructive stimuli.
NAcC neurons encode the incentive stimulus more strongly than the instructive stimuli
Our behavioral analysis suggests that the incentive stimulus gives the rat the opportunity to decide whether to engage in reward seeking, while instructive stimuli provide information regarding the type of behavioral response to perform on the response lever. We therefore sought to understand the underlying neuronal processes by which the NAcC is necessary for responding to incentive but not instructive stimuli. We investigated the neuronal representation of the stimuli used in this task by recording the activity of multiple single neurons in the NAcC (434 neurons recorded from 64 sessions in 10 rats; see Materials and Methods, Histology; Fig. 3).
Histological reconstruction of electrode placements shown on coronal sections.
As previously described (Nicola et al., 2004a,b; Ambroggi et al., 2011), NAcC neurons responded to many different task events, including stimuli, actions and rewards. We focused our analysis on responses to stimuli (Fig. 4). All three stimuli evoked excitatory responses that occurred shortly after stimulus onset. Excitation onset latencies were similar across the three stimuli (INC 69 ± 6 ms, GO 75 ± 16 ms, NOGO 77 ± 7 ms, Kolmogorov–Smirnov test, p > 0.25 for all comparisons; Fig. 4F). The durations of the responses were variable with similar distributions (INC 1.85 ± 0.27 s, GO 1.57 ± 0.49 s, NOGO 1.74 ± 0.56 s, Kolmogorov–Smirnov test, p > 0.18 for all comparisons; Fig. 4F).
NAcC neurons respond more strongly to the incentive than the instructive stimuli. A, Heatmaps showing stimuli-evoked activity of all excited neurons. Each row represents an individual neuron's response to the stimulus averaged across all correct trials during the session. Neurons are sorted from top to bottom by the duration of their response. Colors indicate the z-score-normalized firing rate; hot colors represent excitation (z-score > 0), cool colors represent inhibition (z-score < 0). Data are plotted with 20 ms bins and smoothed with a LOWESS method (span = 6) for display purposes. ILP and RLP represent the average time of the initiation and response lever presses, respectively; En and Ex, average time of the entry and exit into the reward receptacle, respectively. B, Average PSTH for all neurons shown in A. Lines represent the mean response; shaded regions represent ± SEM. C, Same representation as in A but for stimuli-inhibited neurons. D, Same representation as in B but for stimuli-inhibited neurons. E, Percentage and average z-score (for the first 250 ms following stimulus onset) for excited (top) and inhibited (bottom) neurons. The numbers of neurons responsive to stimuli are indicated in the bar graphs. **p < 0.01 (Bonferroni post hoc tests). F, Response onsets and durations for INC, GO and NOGO stimuli excitations (top) and inhibitions (bottom). G, Percentage of neurons excited (purple) or inhibited (yellow) or nonresponsive (gray) to the instructive stimuli for all neurons, neurons nonresponsive to the INC stimulus, neurons excited by the INC stimulus, and neurons inhibited by the INC stimulus. The numbers of neurons are indicated in the pie charts.
A striking difference between the incentive and the instructive stimuli was the size of the populations responding to these events. The INC stimulus evoked excitatory responses in 3 times more neurons than GO or NOGO stimuli (χ2 = 40.7, p < 0.0001; Fig. 4A,E). Furthermore, the average response magnitude (0–250 ms poststimulus) was considerably larger to the INC stimulus than to the GO or NOGO stimuli (Fig. 4A,B,E). This effect did not depend on differences between neurons recorded in different animals or the hemisphere they were recorded from (three-way ANOVA, cue effect: F(2,119) = 5.210, p = 0.007; subject effect: F(8,119) = 0.599, p = 0.777; recording side effect: F(1,119) = 1.197, p = 0.276). This result indicates that the type of behavioral response to incentive or instructive stimuli overcomes the preferential selectivity for contralateral actions (Roesch et al., 2009).
In more than half of the instructive stimulus-inhibited neurons, the inhibitions began before the onset of the stimuli (Fig. 4F) and were therefore not driven by the stimulus itself. Response durations were different, with prolonged inhibitions to the NOGO stimulus compared with the INC and GO stimuli (INC 2.36 ± 0.3 s, GO 2.49 ± 0.5 s, NOGO 4.12 ± 0.45 s, Kolmogorov–Smirnov test, INC vs GO, p = 0.094, INC vs NOGO, p = 0.027). For the INC stimulus, the population of inhibited neurons was significantly larger than those of the GO or NOGO stimuli (χ2 = 8.45, p = 0.015; Fig. 4C–E). However, the average response magnitude was similar between all three stimuli (Fig. 4E; three-way ANOVA, cue effect: F(2,199) = 7.362, p = 0.173; subject effect: F(9,199) = 0.747, p = 0.665; recording side effect: F(1,199) = 1.316, p = 0.253).
We then investigated the relationship between the different neuronal populations observed (Fig. 4G). For this analysis, we grouped the neurons that were excited (or inhibited) by either one or both instructive stimuli. We then calculated the percentage of instructive stimulus-excited (or -inhibited) neurons for all neurons, neurons that were nonresponsive to the INC stimulus, neurons that were excited by the INC stimulus, and neurons that were inhibited by the INC stimulus. We found that both instructive stimulus-excited and -inhibited neurons were slightly over-represented in the population of neurons excited (but not inhibited) by the INC stimulus; however, this effect did not reach statistical significance (χ2 = 9.37, p = 0.15).
Separate populations of incentive stimulus-excited NAcC neurons show opposite modulations depending on whether the animal engages in the task
Our data suggest that the neuronal representation of the INC stimulus plays a pivotal role in the decision to engage in reward-seeking. We aimed to decipher whether INC stimulus encoding could provide clues about the local circuit mechanism leading to this commitment to the task. We compared the activity of NAcC neurons between INC stimuli that evoked reward-seeking and those that did not. We restricted this analysis to sessions that contained at least 10 INC stimuli the animals did not respond to (later called “unattended INC stimuli”, 336 neurons, 55 sessions, 10 rats) and used a higher time resolution of 2 ms. The NAc contains a large majority of medium spiny projection neurons (MSNs) and several types of interneurons, including high-firing GABA interneurons (HFINs) and ACh interneurons (AChINs). We identified these populations using the basal firing rate and the coefficient of variation 2 (Fig. 5A; Gage et al., 2010; Sharott et al., 2012; Atallah et al., 2014; Stalnaker et al., 2016).
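For reference, the two features used for this classification can be computed as sketched below. The coefficient of variation 2 is taken here in its standard form, the mean over adjacent interspike intervals of 2|ISI(i+1) − ISI(i)| / (ISI(i+1) + ISI(i)). The Python function names are hypothetical, and the cutoffs separating MSNs, HFINs and AChINs are not given in this section, so the sketch stops at the feature values and does not assign cell types.

```python
def mean_rate(spike_times, duration_s):
    """Mean firing rate in Hz over a recording of duration_s seconds."""
    return len(spike_times) / duration_s


def cv2(spike_times):
    """Coefficient of variation 2 of a spike train.

    spike_times must be strictly increasing (seconds). CV2 is close to 0
    for regular firing and close to 1 for Poisson-like firing.
    """
    isis = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if len(isis) < 2:
        raise ValueError("need at least three spikes to compute CV2")
    # Compare each interspike interval with the next one.
    vals = [2 * abs(nxt - cur) / (nxt + cur) for cur, nxt in zip(isis, isis[1:])]
    return sum(vals) / len(vals)
```

Because CV2 compares only neighboring intervals, it is less sensitive to slow drifts in firing rate than the ordinary coefficient of variation, which makes it a convenient regularity measure over long sessions.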
Neuronal responses to the incentive stimulus depend on the willingness to engage in the task. A, Basal firing rate plotted against the coefficient of variation 2. Colors indicate the cellular type. The colored bar on the right shows the proportion of each cell type in the entire population of neurons (n = 336). B, Heatmaps showing incentive stimulus-evoked activity when the animals initiated a trial (top) and when they did not (bottom) for MOTIV+ (left) and MOTIV− (right) neurons. Neurons are sorted by the cell type and latency of the excitation. Right bars indicate the percentage of AChINs, HFINs, and MSNs. Note: the same neurons are plotted for attended (top) and unattended (bottom) trials. C, Average PSTH for MOTIV+ and MOTIV− neurons for attended (black) and unattended (red) INC stimuli. D, Onset and duration of INC stimulus-evoked responses in MOTIV+ and MOTIV− neurons. E, Top, Regression coefficient plotted against R2 of the regression. Colors indicate the cellular type and filled symbols label significantly correlated neurons. Bottom, Distribution of the regression coefficient in significantly correlated (black bars) and non-significantly correlated (gray bars) neurons.
As reported previously (Nicola et al., 2004b; Morrison et al., 2017), 81% (39/48 neurons) of INC stimulus-excited neurons were activated by INC stimuli only in trials in which the stimulus induced reward-seeking and not in those the animal did not respond to (Fig. 5B,C). Among this population (which we call MOTIV+ neurons), 69% were putative MSNs, and 31% were putative HFINs; these proportions did not differ from those in the entire population (χ2 = 2.85, p = 0.09 and χ2 = 3.23, p = 0.07, respectively). Onset latencies of excitations were similar for MSNs (85 ± 10 ms) and HFINs (129 ± 20 ms, Kolmogorov–Smirnov test, p = 0.14).
More surprising was the discovery of a small neuronal population (19%, 9/48) that responded in the opposite manner: these neurons were more strongly excited by unattended INC stimuli (MOTIV− neurons; Fig. 5B,C). This population was highly enriched in putative AChINs (44%, significantly more than in the entire population, χ2 = 41.7, p < 0.0001). Of the 9 putative AChINs recorded in the population, 4 displayed this pattern of activity in response to the INC stimulus. Three MOTIV− neurons were HFINs (a proportion similar to that found in the entire population, χ2 = 1.87, p = 0.17), and the remaining two were putative MSNs (under-represented compared with the entire population, χ2 = 18.9, p < 0.0001). Another striking difference in the activity of these neurons was the temporal dynamics of their responses. All nine MOTIV− neurons (coming from 3 different animals in 7 different sessions recorded at different depths) were excited at very short latency and for a very brief period of time on unattended INC stimuli (Fig. 5B–D; onset = 32 ± 14 ms, duration = 65 ± 3 ms). Furthermore, the onset latency tended to be shorter on unattended than on attended INC stimuli. This effect was significant for AChINs (Kolmogorov–Smirnov test, p = 0.011), for which the onset of excitation in unattended trials clearly preceded that in attended trials (Fig. 5B–D). The excitation of MOTIV− neurons to the INC stimulus emerged and terminated before the excitation of MOTIV+ neurons (Kolmogorov–Smirnov test, p = 0.0001 for MOTIV+ and MOTIV− onsets comparison and p = 0.01 for MOTIV− offset and MOTIV+ onset comparison).
In summary, when the animal responded to the INC stimulus, MOTIV+ neurons (MSNs and HFINs) were activated ∼85 ms after the appearance of the incentive stimulus, whereas MOTIV− neurons (AChINs and HFINs) displayed an earlier and transient excitation. When the animal did not respond to the INC stimulus, the excitation of MOTIV− neurons (AChINs and HFINs) was much greater than when the animal responded, whereas the response of MOTIV+ neurons was suppressed.
We next sought to determine whether, on trials in which animals responded to the INC stimulus, the evoked activity on individual trials covaried with the latency to engage in the task. We performed a linear regression of the stimulus-evoked response against the latency of the lever press on a trial-by-trial basis. For this analysis, we calculated the firing in time windows adapted to the response temporal dynamics of MOTIV+ and MOTIV− neurons (50–200 ms and 0–40 ms poststimulus, respectively). We found an overall negative relationship between the evoked activity and the latency for MOTIV+ neurons (Wilcoxon test, p = 6 × 10−5; Fig. 5E). Eighteen of the 39 MOTIV+ neurons (46%) showed a significant negative correlation, and only one neuron was positively correlated. There was no difference between the regression coefficients of HFINs and MSNs (Wilcoxon test, p = 0.27). MOTIV− neurons displayed the opposite relationship to the behavioral response latency, with an overall positive correlation between the evoked firing and the latency to lever press (Wilcoxon test, p = 0.004), and 5 of 9 neurons (56%) showing a significant correlation, including the four AChINs. These data illustrate the continuum in the activity of NAcC neurons between responding to the INC stimulus at longer latencies and not responding.
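The trial-by-trial regression described above can be sketched as follows for a single neuron. This is a minimal illustration under stated assumptions rather than the analysis code used in the study: the evoked firing rate is regressed on the lever-press latency (the direction of the regression is an assumption), significance is assessed with a permutation test chosen here only to keep the example dependency-free, and all function names are hypothetical. The two window constants are the poststimulus windows given in the text.

```python
import random
from statistics import mean

# Poststimulus analysis windows from the text, in seconds.
MOTIV_PLUS_WINDOW_S = (0.050, 0.200)
MOTIV_MINUS_WINDOW_S = (0.000, 0.040)

def evoked_rate_hz(spike_times_s, window_s):
    """Firing rate (spikes/s) within a poststimulus window on one trial."""
    start, stop = window_s
    n_spikes = sum(start <= t < stop for t in spike_times_s)
    return n_spikes / (stop - start)

def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary across trials")
    return sxy / sxx, (sxy * sxy) / (sxx * syy)

def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the slope (shuffles y across trials).

    Assumption: a permutation test stands in for whatever significance
    test was applied to each neuron's regression in the study.
    """
    rng = random.Random(seed)
    observed = abs(linear_fit(x, y)[0])
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(linear_fit(x, shuffled)[0]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

With `x` holding the lever-press latencies and `y` the evoked rates of one neuron, a negative slope corresponds to the pattern reported for MOTIV+ neurons (stronger excitation on trials with shorter latencies) and a positive slope to the pattern reported for MOTIV− neurons.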
Selectivity of encoding for GO or NOGO stimuli in the NAcC
We investigated the selectivity of NAcC neurons to instructive stimuli. We compared the firing to GO and NOGO stimuli (0–500 ms post-stimuli, Wilcoxon test) in the neuronal population responding to these instructive stimuli (n = 44 and n = 85 for excited and inhibited neurons, respectively).
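The per-neuron comparison of GO and NOGO firing can be sketched as below. The text specifies only a "Wilcoxon test" on firing in the 0–500 ms poststimulus window; because GO and NOGO trials are distinct trials, this example assumes the unpaired (rank-sum) variant with a normal approximation. The 0.05 threshold and the function names are likewise assumptions, and the labeling rule (preference = higher firing) applies to excited neurons only.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney) test, normal approximation.

    Returns (u_a, p_value). Tied values receive average ranks and the
    variance is tie-corrected; no continuity correction is applied.
    """
    pooled = sorted((v, g) for g, vals in enumerate((a, b)) for v in vals)
    n_a, n_b = len(a), len(b)
    n = n_a + n_b
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    var_u = n_a * n_b / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_a, 1.0
    z = (u_a - n_a * n_b / 2) / math.sqrt(var_u)
    return u_a, math.erfc(abs(z) / math.sqrt(2))

def selectivity(go_rates, nogo_rates, alpha=0.05):
    """Label an excited neuron as GO-selective, NOGO-selective, or nonselective.

    go_rates / nogo_rates: per-trial firing rates in the 0-500 ms window.
    alpha is an assumed significance threshold.
    """
    u_go, p = rank_sum_test(go_rates, nogo_rates)
    if p >= alpha:
        return "nonselective"
    return "GO" if u_go > len(go_rates) * len(nogo_rates) / 2 else "NOGO"
```

The normal approximation is adequate when each condition contributes a reasonable number of trials; for very small trial counts an exact test would be preferable.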
For excitations (Fig. 6A,B), 16 and 25% of neurons had a significant preference for GO and NOGO stimuli, respectively. GO-selective neurons displayed a sustained response that exceeded the GO stimulus period, whereas NOGO-selective neurons exhibited a brief response following the NOGO stimulus. The remaining neurons (59%) showed no significant preference for either stimulus and the average response was similar during the period considered. The distribution of the neuronal types across these populations of neurons did not differ from the entire population (χ2 tests, p = 0.22–0.99 for all comparisons).
Selectivity of GO and NOGO stimuli-evoked excitations. A, Heatmaps showing GO (top row) and NOGO (bottom row) responses for GO selective neurons (left), nonselective neurons (middle), or NOGO selective neurons (right). Neurons are sorted by the neuronal type (shown on the right colored bar as in Fig. 4) and the onset latency of the response. RLP represents the average time of the initiation lever press; En and Ex, average time of the entry and exit into the reward receptacle, respectively. B, Average PSTH aligned to the GO (green) and the NOGO (orange) stimulus for GO selective (left), nonselective (middle), and NOGO selective (right) neurons. C, Same as A for GO and NOGO stimuli-evoked inhibitions. D, Same as B for GO and NOGO stimuli-evoked inhibitions.
The response of neurons inhibited by instructive stimuli started before the presentation of the stimuli (Fig. 4). Yet, the presentation of instructive stimuli differentially affected the response of these neurons (Fig. 6C,D): 13 and 19% of neurons had a significant preference for GO and NOGO stimuli, respectively. HFINs were more represented in nonselective and NOGO-selective neurons than in the entire population (χ2 = 14.7, p = 0.0001 and χ2 = 4.47, p = 0.03, respectively).
Discussion
We aimed to compare the underlying neuronal determinants of incentive and instructive stimuli in the same task and on the same neurons. We developed a task that reveals important differences in these two types of stimuli. The incentive stimulus invites the rat to engage in the task sequence. Once the rat has decided to initiate a trial, it remains engaged in the task until the end of the trial. Responding to the incentive stimulus depends on NAcC AMPA/NMDA and dopamine D1 receptors, whereas the retrieval of the response associated with the instructive stimuli does not. NAcC neurons respond more to the incentive than to the instructive stimuli, and these populations are relatively independent from one another. The analysis of unattended trials reveals a novel population of neurons that is rapidly and strongly activated by the stimulus and highly enriched in AChINs. Overall, these results provide additional evidence (Singh et al., 2011) that the NAcC, through the integration of glutamate and dopamine signals, is specifically involved in deciding whether and when to engage in the task in response to incentive stimuli rather than deciding which action to perform in response to instructive stimuli.
The involvement of the NAcC in instrumental responding to incentive stimuli is consistent with previous work (Yun et al., 2004; Ambroggi et al., 2008, 2011). The weaker effect of dopamine and glutamate antagonists reported here may be due to the constant and relatively short ITI used. Indeed, Nicola (2010) reported that the effect of NAc dopamine blockade inversely scaled with the length of the ITI. The NAcC is proposed to participate in action-selection based on information carried by stimuli (Nicola, 2007; Yin et al., 2008; Humphries and Prescott, 2010; van der Meer and Redish, 2010; Khamassi and Humphries, 2012; Schultz, 2016). The reduced responding to incentive stimuli of NAcC-inactivated rats can thus be interpreted as a failure to retrieve the necessary action in response to stimuli. However, the absence of effect on responding to instructive stimuli contradicts this interpretation and highlights the fact that the NAcC is involved in processing a particular type of stimulus that triggers task engagement.
Yet, some studies reported that NAcC perturbations impaired response selection to instructive stimuli. Interestingly, deficits were evident in tasks where the outcome was uncertain, either because the stimulus-outcome (or stimulus-action) association was learned during the test session or because the delivery was probabilistic (Burton et al., 2014; Costa et al., 2016; Rothenhoefer et al., 2017; Sharpe et al., 2017; Piantadosi et al., 2018). In contrast, when the contingencies were constant throughout training and testing, NAcC perturbation had no effect on performance (Amalric and Koob, 1987; Cole and Robbins, 1989; Robbins et al., 1990; Giertler et al., 2004; Floresco et al., 2006; Castañé et al., 2010; Ghods-Sharifi and Floresco, 2010). In particular, blocking NAcC dopamine did not impede the ability of rats to discriminate two predicted stimuli associated with two rewards of different values (Hauber et al., 2000; Calaminus and Hauber, 2007; Stopper et al., 2013). This suggests that the expected value signal carried by dopamine neurons to the NAcC does not necessarily contribute to discriminating differently valued stimuli in simple situations where uncertainty is low (Floresco, 2015). Thus, these data indicate that the NAcC and its dopamine input are involved in action-selection during learning but not once the learned contingencies have been consolidated.
In tasks using incentive stimuli, there is now convincing evidence that dopamine drives learning and contributes to the motivation to engage in action (Steinberg et al., 2013; Keiflin and Janak, 2015; Chang et al., 2016; Berke, 2018; Saunders et al., 2018). As reported previously (McGinty et al., 2013; Morrison et al., 2017), we found that the magnitude of incentive stimulus-evoked excitations of NAcC projection neurons was inversely correlated with the behavioral latency. In a bandit task, NAcC dopamine release is also inversely correlated with the latency to engage in action and depends on the previous reward rate (Hamid et al., 2016; Mohebi et al., 2019), suggesting that dopamine could contribute to this encoding of motivational signals in NAcC neurons. The monotonic increase in behavioral latency throughout the session suggests that it also depends on satiation. We recently reported that the metabolic status of the animal modulated the firing of NAcC neurons through the action of orexin on paraventricular thalamic neurons that project to the NAcC. Hence, this circuit could also contribute to the modulation of NAcC activity by the motivational level (Meffre et al., 2019).
The weaker representation of the instructive stimulus compared with the incentive stimulus in the NAcC may depend on whether stimuli engage the animal in the task. Interestingly, incentive stimuli trigger larger NAcC dopamine release than instructive stimuli (Saddoris et al., 2015). Yet, other input regions to the NAcC could also contribute. For instance, stimulus predictability was found to reduce the evoked excitations of BLA neurons (Herry et al., 2007), suggesting that the expected instructive stimuli induce a lesser activation of BLA neurons. An additional inhibitory influence onto NAcC neurons could also play a role. Task engagement induces prolonged inhibitions of NAc neurons (Taha and Fields, 2006; Krause et al., 2010; Ambroggi et al., 2011), an effect that we observed here. Such an inhibitory signal triggered by the presentation of the incentive stimulus could dampen excitations driven by excitatory inputs, thus explaining the weakened excitations to instructive stimuli. We previously found evidence for such an inhibitory masking signal in the NAc shell, where the infralimbic cortex prevented NAc neurons from responding to non-rewarding stimuli (Ghazizadeh et al., 2012).
Previous studies reported that NAcC neurons encode both the predictive value of instructive stimuli and their associated actions (Setlow et al., 2003; Roitman et al., 2005; Roesch et al., 2009; Goldstein et al., 2012; Bissonette et al., 2013; Strait et al., 2015; Sleezer et al., 2016). Our present data extend these findings by showing that instructive stimulus responses of NAcC neurons encode subsequent action and action restraint.
Given the large number of trials per session, rats often disengaged from the task for several consecutive trials. We observed that such trials represented periods in which the rats explored the cage, groomed, or rested, indicating that they valued these activities more than engaging in the task. We speculated that an additional mechanism would come into play to counteract the response-promoting signals carried by dopamine and BLA inputs (Nicola et al., 2004b; Ambroggi et al., 2008). We found that both ACh and GABA interneurons display an activity pattern compatible with this hypothesis. AChINs inhibit MSNs (Witten et al., 2010) via nicotinic receptors expressed by NPY-NGF GABA interneurons (English et al., 2012) and modulate dopamine release (Collins et al., 2016). Either or both of these mechanisms could suppress the response of MOTIV+ neurons on trials in which the animals were not willing to respond. In agreement with this hypothesis, blocking NAcC nicotinic transmission was shown to increase Pavlovian-conditioned approach (Wright et al., 2013) and Pavlovian-instrumental transfer (Collins et al., 2016, 2019).
Almost half of the recorded AChINs displayed the well-described triphasic response with an excitation followed by an inhibition and a rebound excitation, as found in primate tonically active neurons (Aosaki et al., 1994; Morris et al., 2004; Apicella, 2007, 2017; Doig et al., 2014). Most studies on striatal AChINs focused on the inhibitory or rebound component, showing that they encoded stimulus value. However, these experiments were conducted in highly motivated primates and reported that the initial excitation was variable and often absent (Doig et al., 2014). Our data indicate that the absence of behavioral responding to a stimulus is essential to strongly activate AChINs. Yet, the absence of responding is governed by multiple cognitive processes that involve different neuronal circuits. In the DS task, rats learned to extinguish responding to non-rewarding stimuli during training (Ghazizadeh et al., 2012), and no evidence of a MOTIV− profile was found in neurons responding to the non-rewarding stimulus (Nicola et al., 2004b; Ambroggi et al., 2011). Thus, it seems that MOTIV− neurons are recruited when rats decide not to respond to stimuli that are nonetheless associated with rewards.
A remarkable aspect of MOTIV− neurons is the rapidity and brevity of their responses to stimuli, which are nonetheless strongly correlated with engagement in the task in the next 10 s. This timing suggests that the decision to engage or not in reward-seeking is predetermined before the appearance of the stimulus and provides support to the hypothesis that this system has attentional properties (Floresco et al., 2006; Smith et al., 2011). Intriguingly, NAcC excitations to stimuli are magnified by the proximity of the rat to the lever (McGinty et al., 2013; Morrison and Nicola, 2014; Morrison et al., 2017). This encoding could reflect the attention of the rats toward the task-relevant stimuli and depend on the level of activation of AChINs. These results call for further studies investigating whether incentive stimulus responses of AChINs are preconditioned by a tonic process signaling the internal state of the animal and its motivation to engage in other behaviors.
Footnotes
This work was supported by the ATIP-Avenir program, the Wheeler Center for Neurobiology of Addiction, and the European Union's Horizon 2020 research and innovation program under Grant 767092 to F.A., and by the French Ministry of Education and Research to M.S. We thank Saleem M. Nicola for critical comments on the paper; Bruno Poucet, Thierry Hasbroucq, and Boris Burle for helpful discussions; Dany Paleressompoulle for technical support; and Howard L. Fields for his incommensurable support and mentoring.
The authors declare no competing financial interests.
Correspondence should be addressed to Frederic Ambroggi at frederic.ambroggi{at}univ-amu.fr