Abstract
An essential aspect of goal-directed decision-making is selecting actions based on anticipated consequences, a process that involves the orbitofrontal cortex (OFC) and potentially, the plasticity of dendritic spines in this region. To investigate this possibility, we trained male and female mice to nose poke for food reinforcers, or we delivered the same number of food reinforcers non-contingently to separate mice. We then decreased the likelihood of reinforcement for trained mice, requiring them to modify action–outcome expectations. In a separate experiment, we blocked action–outcome updating via chemogenetic inactivation of the OFC. In both cases, successfully selecting actions based on their likely consequences was associated with fewer immature, thin-shaped dendritic spines and a greater proportion of mature, mushroom-shaped spines in the ventrolateral OFC. This pattern was distinct from spine loss associated with aging, and we identified no effects on hippocampal CA1 neurons. Given that the OFC is involved in prospective calculations of likely outcomes, even when they are not observable, constraining spinogenesis while preserving mature spines may be important for solidifying durable expectations. To investigate causal relationships, we inhibited the RNA-binding protein fragile X mental retardation protein (encoded by Fmr1), which constrains dendritic spine turnover. Ventrolateral OFC-selective Fmr1 knockdown recapitulated the behavioral effects of inducible OFC inactivation (and lesions; also shown here), impairing action–outcome conditioning, and caused dendritic spine excess. Our findings suggest that a proper balance of dendritic spine plasticity within the OFC is necessary for one's ability to select actions based on anticipated consequences.
SIGNIFICANCE STATEMENT Navigating a changing environment requires associating actions with their likely outcomes and updating these associations when they change. Dendritic spine plasticity is likely involved, yet relationships are unconfirmed. Using behavioral, chemogenetic, and viral-mediated gene silencing strategies and high-resolution microscopy, we find that modifying action–outcome expectations is associated with fewer immature spines and a greater proportion of mature spines in the ventrolateral orbitofrontal cortex (OFC). Given that the OFC is involved in prospectively calculating the likely outcomes of one's behavior, even when they are not observable, constraining spinogenesis while preserving mature spines may be important for maintaining durable expectations.
Introduction
Navigating a changing environment requires associating actions with their outcomes, modifying these associations when they change, and making predictions about the consequences of one's behavior. Action–outcome-based decision-making likely depends on coordinated corticostriatal regions including specific compartments of the medial prefrontal cortex (Rudebeck et al., 2008; Gourley and Taylor, 2016; Hart et al., 2018), and also orbitofrontal cortex (OFC). The OFC is a large brain structure that is conceptualized as building a cognitive map of “task spaces”, allowing organisms to link behaviors and stimuli with anticipated outcomes, even when these associations are not readily observable (Wilson et al., 2014; Stalnaker et al., 2015). Another function ascribed to the OFC, which is not mutually exclusive, is updating expectations when familiar contingencies change (Sul et al., 2010; Fiuzat et al., 2017). Whereas the lateral-most regions of the OFC may specialize in stimulus–outcome representations (Rudebeck et al., 2008), experiments using lesion and inactivation strategies across rodent and primate species suggest that ventral and medial subregions are involved in modifying and solidifying action–outcome expectations (Gourley et al., 2013a, 2016; Gremel and Costa, 2013; Bradfield et al., 2015; Jackson et al., 2016; Fiuzat et al., 2017; Zimmermann et al., 2017, 2018). Although the precise computational strategies by which the OFC coordinates prospective decision-making are debated (Sul et al., 2010; Riceberg and Shapiro, 2017; Stalnaker et al., 2018), animals need the ability to update information about optimal response strategies, without which they may instead defer to familiar, inflexible, habit-based response strategies.
Several types of behavioral plasticity are associated with changes in dendritic spines, the primary sites of excitatory synapses in the brain. For example, many forms of learning and memory are accompanied by dendritic spinogenesis (Moser et al., 1994; Leuner et al., 2003; Restivo et al., 2009; Vetere et al., 2011a,b; Bock et al., 2014; Kuhlman et al., 2014; Nishiyama, 2014; González-Tapia et al., 2015, 2016; Mahmmoud et al., 2015; Jasinska et al., 2016; Ma et al., 2016) or spine elimination (Vetere et al., 2011b; Sanders et al., 2012; Jasinska et al., 2016; Ma et al., 2016; Swanson et al., 2017). Spine plasticity is also associated with proficiency in certain motor tasks (Fu et al., 2012; Liston et al., 2013; Hayashi-Takagi et al., 2015; González-Tapia et al., 2016) and, potentially, action–outcome expectation, given that drugs that enhance action–outcome learning can trigger spine elimination in certain brain regions (Swanson et al., 2017). Additional pharmacological investigations revealed that action–outcome-based decision-making was associated with dendritic spines containing large heads in the OFC (DePoy et al., 2016; Sharp et al., 2017). This pattern is significant because many experience-elicited synapses are transient, but a fraction associated with spine head enlargement is durably maintained (Holtmaat et al., 2005). Whether spine plasticity in the OFC is associated with action–outcome expectancies under naturalistic (drug-free) circumstances is unclear.
Dendritic spines can be classified by their shape, which corresponds with their function. Mushroom-shaped spines contain large, bulbous heads, and are considered mature and synapse containing, whereas thin-type spines are transient extensions with the potential for synapse formation (Bourne and Harris, 2007). We find that action–outcome conditioning eliminates thin-type dendritic spines in the ventrolateral OFC, resulting in a larger proportion of spines that are mushroom shaped. Given that the OFC is involved in prospective calculations of likely outcomes, even when they are not observable (Wilson et al., 2014), constraint on thin spines may be important for establishing and maintaining durable expectations. Directly linking cell structure with behavior has historically been challenging, however, because of limited means for manipulating structural plasticity in vivo. Here we also selectively reduced the RNA-binding protein fragile X mental retardation protein (FMRP), an endogenous inhibitor of dendritic spine turnover (Pan et al., 2010), in the ventrolateral OFC. FMRP deficiency caused dendritic spine excess and impeded action–outcome conditioning, suggesting that proper regulation of dendritic spine plasticity within the OFC optimizes an organism's ability to select actions based on expected consequences.
Materials and Methods
Subjects
Subjects were male and female C57BL/6 mice bred in-house from Jackson Laboratories stock. When the sexes differed, they are represented separately. Dendritic spine imaging was accomplished using mice expressing Thy1-driven yellow fluorescent protein (YFP; Feng et al., 2000; H line) back-crossed onto a C57BL/6 background. Mice (total n = 142) were maintained on a 12 h light cycle (07:00 on) and provided food and water ad libitum except during instrumental conditioning when body weights were reduced to 90–93% of baseline to motivate responding. Mice were 6–10 weeks old at the start of the experiments except: (1) in the case of viral vector infusions, a “young” infusion group was included and received infusions at postnatal day (P)31; and (2) in one dendritic spine imaging experiment, an intact group of mice aged ∼8 months old was included. All procedures were Emory University IACUC approved.
Behavioral testing
Instrumental response training and action–outcome contingency degradation.
Mice were trained to nose poke for food reinforcement (20 mg grain-based pellets; Bioserv) using illuminated Med-Associates conditioning chambers equipped with multiple nose-poke recesses and a food delivery magazine. Initially, mice were trained using a fixed ratio 1 (FR1) schedule; 30 pellets were available for responding on each of two active nose-poke apertures, resulting in 60 pellets/session. The sessions ended at 135 min or when mice acquired all 60 pellets in our initial experiments (Fig. 1). For expediency, the sessions ended at 70 min or when mice acquired all 60 pellets in our subsequent experiments. Mice required between 5 and 17 daily training sessions to initially acquire all 60 pellets within the allotted time. Response acquisition curves represent both responses/minute during the final five sessions unless otherwise noted, and throughout, we detected no response biases that would otherwise impact our findings.
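The session rules above can be illustrated with a minimal Python sketch. This is not the Med-Associates control code used in the study; the class name, method names, and aperture labels ("left", "right") are hypothetical. Only the cap of 30 pellets per aperture, the 60-pellet total, and the 70 min limit of the later experiments are taken from the text.

```python
from dataclasses import dataclass, field


@dataclass
class FR1Session:
    """Fixed ratio 1: every poke on an aperture earns one pellet until that
    aperture's cap is reached (hypothetical sketch, not the study's code)."""
    pellets_per_aperture: int = 30      # from the text: 30 pellets per aperture
    time_limit_min: float = 70.0        # from the text: 70 min in later experiments
    # Aperture labels are assumptions made for illustration.
    earned: dict = field(default_factory=lambda: {"left": 0, "right": 0})

    def finished(self, t_min: float) -> bool:
        """Session ends at the time limit or once all 60 pellets are earned."""
        total_cap = 2 * self.pellets_per_aperture
        return t_min >= self.time_limit_min or sum(self.earned.values()) >= total_cap

    def poke(self, aperture: str, t_min: float) -> bool:
        """Return True if a poke at time t_min (minutes) is reinforced."""
        if self.finished(t_min):
            return False
        if self.earned[aperture] >= self.pellets_per_aperture:
            return False                # this aperture's 30 pellets are exhausted
        self.earned[aperture] += 1
        return True
```

Because each aperture is capped separately, a mouse cannot finish the session by responding on one aperture alone, which is consistent with the requirement that all 60 pellets be acquired across both responses.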
Instrumental contingency degradation can be used to assess whether mice select actions according to anticipated consequences (Balleine and O'Doherty, 2010). On one day, one nose-poke aperture was occluded, and reinforcers were delivered into the magazine independent of animals' interactions with the remaining available aperture. Instead, pellets were delivered for 25 min at a rate that was matched to each animal's individual reinforcement rate from the previous session. This procedure “degrades” the predictive relationship between actions and their outcomes. In another session, only the opposite aperture was available, and responding was reinforced, as during training, thus maintaining the predictive relationship between that response and the associated outcome. The order of these sessions and the location of the “degraded” aperture were counterbalanced.
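The rate matching used in the degradation session can be sketched as follows. The text states only that pellets were delivered for 25 min at each animal's reinforcement rate from the previous session; the even spacing of deliveries, the function name, and the parameter names are assumptions made for illustration.

```python
from typing import List


def noncontingent_delivery_times(prev_pellets: int,
                                 prev_session_min: float,
                                 window_min: float = 25.0) -> List[float]:
    """Return delivery times (minutes from session start) for response-independent
    pellets, matched to the previous session's reinforcement rate.

    Assumption: pellets are evenly spaced; the source does not specify spacing.
    """
    if prev_pellets <= 0 or prev_session_min <= 0:
        raise ValueError("previous session needs a positive pellet count and duration")
    rate = prev_pellets / prev_session_min        # pellets per minute
    interval = 1.0 / rate                          # minutes between pellets
    n_pellets = int(window_min * rate + 1e-9)      # whole pellets that fit in the window
    return [interval * (i + 1) for i in range(n_pellets)]
```

For a hypothetical mouse that earned 60 pellets in 30 min, the rate is 2 pellets/min, so the sketch schedules 50 pellets at 0.5 min spacing across the 25 min session.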
To determine whether mice formed or updated action–outcome associations, both apertures were subsequently available during a 10–15 min probe test conducted in extinction. A goal-directed response strategy is to preferentially engage the action that is likely to be reinforced, whereas a failure to differentiate between the degraded and non-degraded relationships reflects a failure in action–outcome conditioning.
In two experiments, sensitivity to action–outcome contingency degradation was tested multiple times using a within-subjects experimental design, following the model of Dias-Ferreira et al. (2009) and others. After the first test, mice were further trained using either an FR1 or a random interval (RI) 30 s schedule of reinforcement, as indicated. Both nose-poke responses were reinforced, and sessions ended when the mice acquired 60 pellets, or at 70 min. Following a second test for sensitivity to instrumental contingency degradation, we trained mice further using an RI60-s schedule for four sessions, a protocol that biases typical mice toward engaging in stimulus–response habits rather than goal-directed actions (Swanson et al., 2015). Then, the instrumental contingency degradation procedure was repeated. The location of the degraded aperture was always opposite that in the previous test.
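For readers unfamiliar with random interval schedules, the sketch below shows one common way such a schedule can be simulated. The source does not describe how the RI schedule was programmed, so the exponential arming interval, the function name, and the parameters are all assumptions: a reinforcer becomes available after a random delay averaging the nominal interval (30 or 60 s), and the first response after that point is reinforced.

```python
import random
from typing import List, Sequence


def simulate_random_interval(response_times_s: Sequence[float],
                             mean_interval_s: float = 30.0,
                             seed: int = 0) -> List[float]:
    """Return the response times (seconds) that would be reinforced under a
    random interval schedule (illustrative assumption, not the study's program)."""
    rng = random.Random(seed)
    reinforced = []
    # Time at which the next reinforcer becomes available ("armed").
    armed_at = rng.expovariate(1.0 / mean_interval_s)
    for t in sorted(response_times_s):
        if t >= armed_at:
            reinforced.append(t)
            # Re-arm after a new random delay measured from this reinforcer.
            armed_at = t + rng.expovariate(1.0 / mean_interval_s)
    return reinforced
```

Under this arrangement, responding faster yields little additional reinforcement, which loosens the link between response rate and outcome rate relative to FR1.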
Non-contingent pellet delivery.
Each mouse subject to non-contingent pellet delivery was paired with another mouse that was behaviorally tested as described above. The “non-contingent” mouse was food-restricted and placed in a conditioning chamber daily for the same duration as its pair; however, in this condition pellets were delivered non-contingently at the same rate as acquired by the yoked animal, and responses on the nose-poke apertures had no programmed consequences.
Surgical and histological procedures
OFC lesions.
Mice were anesthetized with 1:1 2-methyl-2-butanol and tribromoethanol (Sigma-Aldrich) diluted 40-fold with saline and placed in a digitized stereotaxic frame (Stoelting). The scalp was incised, skin retracted, bregma and lambda identified, the head leveled, and coordinates located. NMDA (20 mg/ml; in saline) was delivered at AP: +1.7, ML: ±0.25, DV: −3.0 and AP: +2.6, ML: ±1.2, DV: −2.8 to generate cell-body lesions. The infusion volume was 0.1 μl per infusion delivered over the course of 1 min, and needles were left in place for 2 additional minutes before withdrawal and suture. Mice were allowed ≥1 week to recover before testing. Following testing, mice were killed by rapid decapitation. Brains were extracted, submerged in chilled 4% paraformaldehyde for 48 h, and then transferred to chilled 30% w/v sucrose. Brains were sectioned into 40 μm coronal sections. Infusion sites were then characterized by immunostaining for glial fibrillary acidic protein as previously described (Gourley et al., 2010). Three mice were excluded because lesions extended significantly into M2.
OFC-targeted hM4Di-DREADD delivery.
Mice were anesthetized with 80 mg/kg ketamine/0.5 mg/kg Dexdormitor and placed in a digitized stereotaxic frame (Stoelting). The scalp was incised, skin retracted, bregma and lambda identified, the head leveled, and coordinates located. Adeno-associated viruses [(AAV5)-CaMKII-hM4Di-(Gi)-mCherry or AAV5-CaMKII-mCherry, generated by Roth (for review, see Urban and Roth, 2015) and the University of North Carolina Viral Vector Core] were delivered bilaterally into the OFC (0.5 μl/infusion over the course of 5 min; coordinates: AP: +2.6, ML: ±1.2, DV: −2.8). Needles were left in place for 5 additional minutes before withdrawal and suture. Mice were allowed ≥2 weeks to recover before testing. One hour following the final test session, mice were killed by rapid decapitation. Brains were extracted, submerged in chilled 4% paraformaldehyde for 48 h, and then transferred to chilled 30% w/v sucrose. Brains were later sectioned into 40 μm coronal sections. Infusion sites were characterized by imaging mCherry. Three mice were excluded for lack of bilateral mCherry expression.
OFC-targeted Fmr1 knockdown.
Lentiviral vectors were created by the Emory University Viral Vector Core and expressed either a short-hairpin RNA directed against Fmr1 (shFmr1) and mCherry or a scrambled construct and mCherry, under the H1 promoter (for production details, see Gross et al., 2015). In prior investigations, we confirmed that the same stock of shFmr1 lentiviruses decreases FMRP protein and increases phosphorylated mTOR in mouse prefrontal cortex, as would be expected (Gross et al., 2015).
Mice were anesthetized with 80 mg/kg ketamine/0.5 mg/kg Dexdormitor and placed in a digitized stereotaxic frame (Stoelting). The scalp was incised, skin retracted, bregma and lambda identified, the head leveled, and coordinates located. Viral vectors were infused over 2.5 min in a volume of 0.25 μl/hemisphere at AP: +2.6, ML: ±1.2, DV: −2.8. Needles were left in place for 5 additional minutes before withdrawal and suturing. Experiments were initiated 25 d later to allow time for viral vector expression.
Throughout, infusions were bilateral. For behavioral experiments, mice received a single viral vector (shFmr1 or scrambled control) in both cerebral hemispheres. Mice were infused at P31 (“young” infusion) or, for comparison, P56 (“adult” infusion). For dendritic spine imaging studies, one hemisphere received the shFmr1-expressing viral vector, and the opposite hemisphere received the scrambled control construct. Left versus right hemispheres were counterbalanced, and mice were 31 d old.
All mice were killed by rapid decapitation. Brains were extracted, submerged in chilled 4% paraformaldehyde for 48 h, and then transferred to chilled 30% w/v sucrose. Infusion sites were characterized by imaging mCherry in 40-μm-thick coronal sections.
CNO delivery (dosing and timing)
In experiments using hM4Di-DREADDs, mice received 1 mg/kg clozapine-N-oxide (CNO; Sigma-Aldrich) dissolved in 2% dimethylsulfoxide and saline (1 ml/100 g, i.p.). CNO was delivered immediately following the instrumental contingency degradation procedure, a time point selected based on experiments revealing that post-training OFC inactivation causes failures in action–outcome updating and thus, failures in goal-oriented action selection the following day (even though the OFC is back “on-line”; Zimmermann et al., 2018). CNO was delivered to all mice, to expose all animals to any unintended consequences of CNO, e.g., conversion to clozapine (Gomez et al., 2017). Importantly, the probe test occurred 24 h following injection, when the drug would no longer be expected to be on-board (Gomez et al., 2017).
In a second experiment using intact, virus-naive mice, CNO or vehicle was delivered identically as above to confirm that our relatively low dose of CNO (Urban and Roth, 2015) did not have unintended consequences.
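The dosing arithmetic implied above (1 mg/kg delivered in 1 ml/100 g) can be checked with a short Python sketch. The function name and the 25 g body weight used in the example are hypothetical; only the dose and injection volume are taken from the text.

```python
from typing import Tuple


def cno_injection(body_weight_g: float,
                  dose_mg_per_kg: float = 1.0,
                  volume_ml_per_100g: float = 1.0) -> Tuple[float, float, float]:
    """Return (injection volume in ml, CNO mass in mg, solution concentration in mg/ml)."""
    volume_ml = body_weight_g / 100.0 * volume_ml_per_100g   # 1 ml per 100 g body weight
    dose_mg = body_weight_g / 1000.0 * dose_mg_per_kg        # 1 mg per kg body weight
    concentration_mg_per_ml = dose_mg / volume_ml            # independent of body weight
    return volume_ml, dose_mg, concentration_mg_per_ml
```

For a hypothetical 25 g mouse, this gives an injection of 0.25 ml containing 0.025 mg CNO, which corresponds to a 0.1 mg/ml working solution regardless of body weight.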
Dendritic spine imaging and characterization
Dendritic spine imaging.
YFP-expressing mice were killed by rapid decapitation, and in the case of behavioral experiments, euthanasia occurred 1 h following the probe test. Brains were submerged in 4% paraformaldehyde for 48 h, then transferred to chilled 30% w/v sucrose, followed by sectioning into 40-μm-thick sections on a microtome held at −15 ± 1°C. Unobstructed dendritic segments running parallel to the surface of the section were imaged on a spinning disk confocal (VisiTech International) on a Leica microscope. Z-stacks were collected with a 100× 1.4 NA objective using a 0.1 μm step size, sampling above and below the dendrite. Laser intensity was optimized and then held constant. Camera gain and exposure were varied to ensure that the brightest dendritic spines were within the dynamic range of the camera. After imaging, we confirmed at lower magnification that each image was collected from the intended region: second-order or higher OFC dendrites localized within the lateral and ventral subregions of the OFC, 50–150 μm from the soma. For comparison, secondary apical dendrites located 150–250 μm from the somatic layer in dorsal hippocampal CA1 were also imaged. Dendrite images were acquired bilaterally from 5 to 8 independent neurons/mouse (except for one hM4Di-DREADD-expressing mouse, in which only 3 infected and 4 uninfected dendritic segments were collected because of unanticipated tissue damage). Importantly, experimenters were blinded to group.
Semiautomated dendrite and dendritic spine reconstruction.
3-D dendrite reconstructions were accomplished with the FilamentTracer module of Imaris (Bitplane) as described previously (Swanger et al., 2011; Gourley et al., 2013b): a dendritic segment 15–25 μm in length was drawn using the autodepth function. FilamentTracer processing algorithms centered the segment and determined dendrite diameter. Dendritic spines were detected with the autodepth function. Each spine was then reconstructed in 3-D using FilamentTracer algorithms. Dendritic spines were classified using established parameters (for hippocampal neurons, see Swanger et al., 2011; for prefrontal cortical neurons, see Radley et al., 2013). A single blinded individual processed all images within a single experiment.
Western blotting
Mice were briefly anesthetized by isoflurane and killed by decapitation following the probe test. Brains were frozen at −80°C, then sectioned at 1 mm. The ventrolateral OFC was dissected by a single experimenter using a 1 mm corer. Tissue was homogenized by sonication, and protein content was measured by Bradford colorimetric assay. Fifteen micrograms of protein/sample were separated by SDS-PAGE on a 4–20% gradient tris-glycine stain-free gel (Bio-Rad). Following transfer to PVDF membrane, membranes were blocked with 5% nonfat milk.
Primary antibodies were anti-phospho-ERK1/2 (Rb, Cell Signaling Technology, 9101s, lot 30; 1:1000), anti-ERK1/2 (Rb, Cell Signaling Technology, 9102s, lot 30; 1:2000), anti-phospho-cofilin (Ser3; Rb, Cell Signaling Technology, 3311s, lot 11; 1:250), and anti-cofilin (Rb, ECM Biosciences, CP1131, lot 2; 1:400). Membranes were incubated overnight and then in horseradish peroxidase-conjugated goat anti-rabbit (Vector Laboratories; 1:5000) secondary antibody. Immunoreactivity was assessed using a chemiluminescence substrate (Pierce) and measured using a ChemiDoc Imager (Bio-Rad). Phospho-signals were normalized to the corresponding total protein signals (which did not differ between groups), then to the control sample mean from the same membrane to control for variance between gels. Total protein was also measured by imaging the SDS-PAGE gel before transfer and did not differ between groups. Phospho-ERK1/2 was measured in two independent cohorts of mice. To the second cohort's membranes, we added anti-cofilin antibodies. All gels were run at least twice, with concordant outcomes.
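The two-step normalization described above (phospho-signal over total protein, then over the control mean from the same membrane) can be sketched as follows. The function name, dictionary keys, and group labels are hypothetical, and the input is assumed to be background-corrected band intensities from a single membrane.

```python
from statistics import mean
from typing import Dict, List


def normalize_phospho(samples: List[Dict], control_group: str = "control") -> List[float]:
    """Normalize phospho-signals for samples run on ONE membrane.

    Each sample is a dict with keys 'group', 'phospho', and 'total'
    (hypothetical field names chosen for illustration).
    """
    # Step 1: phospho-signal relative to the corresponding total protein signal.
    ratios = [s["phospho"] / s["total"] for s in samples]
    # Step 2: express each ratio relative to the control mean on this membrane,
    # which controls for variance between gels.
    control_mean = mean(r for s, r in zip(samples, ratios) if s["group"] == control_group)
    return [r / control_mean for r in ratios]
```

Because the control mean is computed per membrane, values from different gels are placed on a common scale in which the control group averages 1.0; each animal's values from multiple gels can then be averaged before statistical comparison.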
Statistics
Nose-poke rates were compared by ANOVA, with response selection and group as factors, and with repeated measures when appropriate. In the case of significant interactions, post hoc comparisons were made with Tukey's tests, and results are indicated graphically. p < 0.05 was considered significant.
For western blot analyses, each mouse contributed a single value (each animal's mean value from multiple gels). Comparisons were made by two-tailed unpaired t tests. p < 0.05 was considered significant.
For dendritic spine analyses of intact and DREADDs groups, each mouse contributed a single spine density per subtype (reflecting the average of all dendrites from that mouse). The proportion of spines with a mushroom shape was calculated as follows: mushroom spines/all spines, on a per-dendrite basis. Comparisons were made by two-tailed unpaired t test, two-factor ANOVA (spine type × group), or three-factor ANOVA with repeated measures (spine type × virus type × infection status). Throughout, in the case of significant interactions, post hoc comparisons were made with Tukey's tests, and results are indicated graphically. p < 0.05 was considered significant.
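The two summary measures described above can be sketched in Python. The function names and dictionary keys are hypothetical, and expressing density as spines per micrometer of dendrite is an assumption; the text specifies only that densities were averaged across all dendrites from a mouse and that the mushroom proportion was computed per dendrite.

```python
from statistics import mean
from typing import Dict, List

SPINE_TYPES = ("thin", "stubby", "mushroom")


def mushroom_proportion(dendrite: Dict) -> float:
    """Mushroom spines / all spines for a single dendritic segment."""
    total = sum(dendrite[t] for t in SPINE_TYPES)
    return dendrite["mushroom"] / total


def per_mouse_density(dendrites: List[Dict], spine_type: str) -> float:
    """Average density of one spine subtype across all dendrites from one mouse.

    Assumption: density is the spine count divided by segment length in micrometers.
    """
    return mean(d[spine_type] / d["length_um"] for d in dendrites)
```

For example, a hypothetical 20 μm segment with 10 thin, 5 stubby, and 5 mushroom spines has a mushroom proportion of 0.25 and a thin-spine density of 0.5 spines/μm.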
For dendritic spine density and dendrite diameter analyses in shFmr1 experiments, each mouse contributed two density and diameter values (the average density and diameter from dendrites in the shFmr1 cerebral hemisphere versus average density and diameter from dendrites in the scrambled control hemisphere). Comparisons were made by two-tailed paired t test (shFmr1 hemisphere vs control hemisphere). p < 0.05 was considered significant.
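The within-mouse comparison described above can be sketched with the Python standard library. The analyses in the study were run in SigmaStat and SPSS; this function and its argument names are hypothetical, and it returns only the t statistic and degrees of freedom, since the standard library does not provide the t distribution needed for a p value.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(shfmr1: Sequence[float], control: Sequence[float]) -> Tuple[float, int]:
    """Paired t statistic for per-mouse hemisphere averages.

    shfmr1[i] and control[i] must come from the same mouse.
    Returns (t, degrees of freedom).
    """
    if len(shfmr1) != len(control) or len(shfmr1) < 2:
        raise ValueError("need at least two mice with one value per hemisphere")
    diffs = [a - b for a, b in zip(shfmr1, control)]   # within-mouse differences
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))         # mean difference over its SEM
    return t, n - 1
```

Pairing hemispheres within each mouse removes between-animal variability from the comparison, which is the rationale for infusing shFmr1 and the scrambled control into opposite hemispheres of the same animal.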
Throughout, SigmaStat and SPSS were used, and group sizes were determined based on power analyses of preexisting datasets. n values for each individual group are reported in the figure captions.
Results
Instrumental conditioning causes selective dendritic spine plasticity in the ventrolateral OFC
Goal-directed decision-making requires one to associate actions with their outcomes, and to modify these expectations when necessary. Experiments using lesions, chemogenetics, and optogenetics suggest that the ventrolateral OFC in mice is involved in this process (Gourley et al., 2013a; Gremel and Costa, 2013; Zimmermann et al., 2017, 2018; Baltz et al., 2018). To investigate whether forming or modifying action–outcome expectancies is associated with neuronal structural plasticity in the OFC, we trained mice expressing YFP in layer V cortical neurons to acquire food reinforcers (Fig. 1a). A separate group of mice (“no training”) was similarly food-restricted and placed daily in the operant conditioning chambers, but food pellets were delivered non-contingently at a rate paired to a mouse responding for food reinforcers. These mice were thus exposed to food restriction, handling, the testing chambers, and food pellets, but did not learn that nose poking was reinforced. Although all mice investigated the nose-poke apertures during the initial two training sessions regardless of group, nose poking in food-reinforced mice subsequently increased as expected, whereas exploration of the nose-poke recesses in the non-reinforced mice dropped, also as expected (interaction: F(4,44) = 6.4, p < 0.001; Fig. 1b). Meanwhile, both groups received equivalent amounts of food pellets (Fig. 1c).
Figure 1. Action–outcome conditioning triggers dendritic spine plasticity in the ventrolateral OFC. a, Task schematic: first, mice were trained to nose poke for food reinforcers, then the action–outcome contingency associated with one response was modified (degraded) by providing the associated food pellet non-contingently. Whether mice preferred the rewarded behavior, evidence of associating actions with their outcomes, was then measured in a probe test. b, Half of our mice were trained to nose poke for food reinforcers as indicated (“trained” group). Meanwhile, another group of mice, referred to as the no training group, was placed in the conditioning chambers, and pellets were delivered at a rate matched to a paired trained mouse. Nose poking had no consequences. All mice initially explored the nose-poke apertures, including no training mice, but only trained mice energized their response rates over multiple days. Rates reflect entries on both recesses. c, Despite different response patterns, groups received equivalent amounts of food pellets throughout. d, Mice that actively responded for food pellets were sensitive to instrumental contingency degradation, indicated by preferential responding during a probe test, whereas mice given non-contingent pellet access did not respond, as expected. e, Mice were killed 1 h following the probe test, revealing that the trained group had fewer thin-type dendritic spines in the ventrolateral OFC. f, Furthermore, the overall proportion of spines with a mature, mushroom shape was elevated in trained mice. Representative dendrites are adjacent. g, As a point of contrast, dendritic spine loss associated with aging was not specific to spine subtype. h, Further, in hippocampal CA1, the densities of stubby-, mushroom-, and thin-type spines did not differ between no training versus trained groups. Representative dendrites are adjacent. Bars and symbols indicate mean + SEM. *p < 0.05. Scale bars, 2 μm. Trained: n = 6; no training: n = 7; aged: n = 6.
We next provided the food pellet associated with one of the nose-poke responses non-contingently, degrading the action–outcome relationship (Fig. 1a, schematic). Subsequently, trained mice preferentially performed the other nose-poke response, evidence of updated action–outcome expectations. Again, the matched, no training group generated virtually no nose pokes at all, as expected (interaction: F(1,11) = 57.5, p < 0.001; Fig. 1d).
We killed mice 1 h following the probe test and imaged and enumerated dendritic spines on layer V neurons in the ventrolateral OFC, of interest because they receive inputs from subcortical structures involved in action–outcome decision-making, such as the basolateral amygdala (Gabbott et al., 2006). Trained mice had fewer thin-type spines (group × spine type interaction F(2,33) = 3.2, p = 0.05; Fig. 1e), resulting in a greater proportion of spines that had a mature, mushroom shape (t(98) = −2.07, p = 0.04; Fig. 1f).
To compare this pattern with another instance in which spine loss might be expected, we next compared dendrites from the trained group, which were ∼8 weeks old upon euthanasia, to dendrites from mice aged ∼8 months that had experienced the same testing procedure before euthanasia. Dendritic spine densities were lower in older mice, as has been observed in other regions of the rodent prefrontal cortex (for instance, compare young and middle-aged rats in the study by Bloss et al., 2011). Importantly, age-related spine loss was not selective to any particular spine type (main effect of age: F(1,27) = 50.67, p < 0.001; no spine type × group interaction: F(2,27) = 1.64, p = 0.21; Fig. 1g), also in agreement with prior investigations (Bloss et al., 2011). Thus, the pattern of dendritic spine loss associated with action–outcome conditioning (selective loss of thin spines) is distinct from spine loss associated with aging.
Next, we imaged dorsal hippocampal CA1 neurons in the ∼8-week-old trained versus non-trained mice, given that neurons in hippocampal CA1 can also be subject to learning-related dendritic spine elimination (in a fear conditioning procedure; Sanders et al., 2012). We found no differences in dendritic spine densities (no interaction or main effect of group: F < 1; Fig. 1h). Thus, action–outcome conditioning is associated with selective dendritic spine plasticity in the OFC, but not CA1 region of the hippocampus.
Associating actions with their outcomes is ventrolateral OFC-dependent
Next, we aimed to block action–outcome conditioning by inactivating the OFC. We first generated mice with excitotoxic, cell-body lesions of the OFC, encompassing the ventrolateral OFC, with some spread into the more lateral compartments (Fig. 2a). Response acquisition during training did not differ between groups (no interaction or main effects of group: F < 1; Fig. 2b). After five FR1 training sessions, lesion mice failed to respond selectively following instrumental contingency degradation (interaction: F(1,11) = 16.6, p = 0.002; Fig. 2c, Test 1). As a replication, we reinstated responding using two additional FR1 training sessions (Fig. 2b), then repeated the contingency degradation procedure. Again, OFC inactivation blocked the ability of mice to select actions based on the likelihood that they would be reinforced (interaction: F(1,11) = 6.1, p = 0.03; Fig. 2c, Test 2). Thus, the ventrolateral OFC appears necessary for selecting actions based on their probable consequences.
Figure 2. Associating actions with their outcomes is ventrolateral OFC-dependent. a, OFC lesions are represented on coronal sections from the Mouse Brain Library (Rosen et al., 2000), with black representing the largest infusion site and white the smallest. b, Mice acquired the nose-poke responses. Acquisition curves represent both responses/minute, and breaks in the response acquisition curve indicate tests for sensitivity to action–outcome contingency degradation. c, At these test points, control mice preferentially engaged the response that was most likely to be reinforced (non-degraded condition), evidence of associating actions with their outcomes. Meanwhile, mice with lesions were insensitive to action–outcome contingencies, generating both responses equivalently. Bars and symbols indicate mean + SEM. *p < 0.05, **p < 0.001 versus non-degraded following interaction effects. n.s., Nonsignificant. Control: n = 7; lesion: n = 6.
Dendritic spine plasticity associated with action–outcome conditioning is activity-dependent
To next generate a condition in which we could image dendritic spines following OFC inactivation, we infused CaMKII-driven viral vectors expressing either mCherry (controls) or hM4Di-DREADDs+mCherry into the OFC of Thy1-YFP-expressing mice (Fig. 3a, experimental timeline, b, neuron colabeled with mCherry and YFP). hM4Di-DREADDs allow for selective and controlled suppression of neural activity by systemic administration of CNO (Urban and Roth, 2015), including in the layer V OFC neurons of interest here (Zimmermann et al., 2018). Viral vectors primarily infected the ventrolateral OFC, but some spread into the lateral cortices was noted (Fig. 3c). These mice did not obviously differ from those with infection restricted to the ventrolateral OFC.
Figure 3. Dendritic spine plasticity in the ventrolateral OFC because of action–outcome updating is activity-dependent. a, Experimental timeline: following OFC-targeted infusion of either AAV5-CaMKII-hM4Di-DREADD-mCherry or AAV5-CaMKII-mCherry alone (control condition), mice underwent instrumental conditioning. Immediately following an action–outcome contingency degradation procedure, all mice were treated with 1 mg/kg CNO. Response preference was tested 24 h later, when mice were drug-free. Mice were killed for dendritic spine imaging 1 h following this test. b, A representative image of mCherry fluorescence in YFP-expressing mice is overlaid on a coronal section from the Mouse Brain Library (Rosen et al., 2000). Inset, High-magnification image of a colabeled neuron. Left, Subregions of the OFC are demarcated [agranular insula (AI); lateral OFC (LO); ventral OFC (VO)]. As in Figure 1, dendrites from the ventral and lateral OFC were imaged. c, Viral vector spread throughout the OFC is summarized, with lighter shading indicating the smallest documented spread and darker shading indicating the largest. d, Groups did not differ in nose-poke response acquisition. Acquisition curves represent the final five training sessions and both responses/min. e, Control mice were sensitive to action–outcome associations, as indicated by preferential responding following an instrumental contingency degradation procedure. Meanwhile, OFC inactivation (hM4Di-DREADD group) blocked action–outcome updating (n = 8 control viral vector, n = 6 hM4Di-DREADD). f, hM4Di-DREADD+ neurons had more thin-type dendritic spines than all other groups, including uninfected neurons from the same mice. g, In addition, hM4Di-DREADD+ neurons had lower proportions of mature, mushroom-type spines. Representative dendrites are adjacent. h, In a control experiment, we addressed concerns that 1 mg/kg CNO might be having unintended consequences. Intact naive mice were trained to nose poke for food reinforcers, then vehicle or CNO was paired with instrumental contingency degradation. Groups were assigned by matching response rates during training. Response acquisition curves represent the final five training sessions and both responses/min (n = 12 vehicle, n = 13 CNO). i, CNO had no effects on subsequent response preference or (j) phospho-ERK1/2 (commonly used as a marker of synaptic plasticity) or the cytoskeletal regulatory factor phospho-cofilin. Representative blots adjacent. Proteins were detected at the expected molecular weights (ERK1/2 at 42 and 44 kDa, and cofilin at 19 kDa). Bars and symbols indicate mean + SEM. *p < 0.05 following interaction effects, **p < 0.001 main effect of group. “p-” refers to phosphorylated. Scale bars, 3 μm; or as indicated.
Mice acquired the instrumental responses, with no group differences (no session × virus interaction: F(4,48) = 1.32, p = 0.3; no main effect of group: F <1; Fig. 3d). The hM4Di-DREADDs ligand CNO was delivered immediately following instrumental contingency degradation, during the presumed period of action–outcome memory updating and retention, which is ventrolateral OFC-dependent (Zimmermann et al., 2018). When response preferences were tested the next day, the previously inactivated group failed to respond in a selective fashion (interaction: F(1,12) = 4.8, p = 0.05; Fig. 3e). Thus, OFC neuroplasticity appears necessary for updating action–outcome associations that support optimal response strategies.
We killed mice 1 h later and imaged and enumerated dendritic spines on layer V neurons in the ventrolateral OFC. We compared multiple populations of dendrites: (1) dendrites expressing the mCherry control virus, (2) dendrites from control viral vector mice that were not infected, (3) dendrites expressing the hM4Di-DREADDs, and (4) dendrites from hM4Di-DREADDs-expressing mice that were not infected. Of these four groups of dendrites, hM4Di-DREADD-expressing dendrites had more immature, thin-type spines (spine type × virus type × infection status: F(2,44) = 4, p = 0.03; Fig. 3f) and a smaller proportion of spines with a mature, mushroom shape (virus type × infection status: F(1,161) = 6.5, p = 0.01; Fig. 3g). In other words, dendritic spine plasticity appeared to be blocked selectively in neurons that were inactivated. This pattern also suggested that viral vector infection did not itself impact dendritic spine densities. We argue that action–outcome updating triggers thin-type spine pruning, resulting in a greater proportion of mature, mushroom-shaped spines. When a neuron is inactivated, this spine elimination does not occur (findings summarized in Table 1).
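The arithmetic behind this interpretation can be sketched with a short example: if thin-type spines are pruned while other spine types are left unchanged, the mushroom-type proportion rises even though no mushroom-type spines are added. The counts below are hypothetical and chosen only for illustration; they are not data from this study, and the function name `mushroom_proportion` and the `other` category are assumptions introduced here.

```python
# Hypothetical spine counts on one dendritic segment (illustrative only,
# not data from this study).
def mushroom_proportion(thin, mushroom, other):
    """Return the fraction of all spines that are mushroom-type."""
    total = thin + mushroom + other
    if total == 0:
        raise ValueError("segment has no spines")
    return mushroom / total

# Same mushroom and other counts; only thin-type spines are pruned.
no_pruning = mushroom_proportion(thin=10, mushroom=5, other=5)   # 5 / 20
with_pruning = mushroom_proportion(thin=5, mushroom=5, other=5)  # 5 / 15

print(f"no pruning:   {no_pruning:.2f}")
print(f"with pruning: {with_pruning:.2f}")
```

In this sketch the mushroom-type count is identical in both cases, so the change in proportion comes entirely from the smaller denominator.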
Table 1. Summary of dendritic spine modifications in the ventrolateral OFC
One notable difference from our experiments in Figure 1 was an apparent shift in baseline dendritic spine densities. Gradual, age-related spine loss could contribute to this observation (given time allotted for the viral vector to express; Fig. 1g), but exposure to surgery and/or the presumed DREADD ligand CNO could also conceivably be contributing factors. Given concerns regarding unintended consequences of CNO (namely, conversion to clozapine; Gomez et al., 2017), we tested whether the dose of CNO used here had any effects on sensitivity to instrumental contingency degradation. Intact, naive mice were trained to perform nose-poke responses (no effect of group: F(1,23) = 2.8, p = 0.11; no interaction: F < 1; Fig. 3h), then CNO or vehicle was delivered immediately following instrumental contingency degradation. Response preference the following day was comparable between groups (main effect of response: F(1,23) = 4.15, p < 0.001; no effect of CNO: F(1,23) = 1.1, p = 0.31, no interaction: F < 1; Fig. 3i).
We also found no effects of CNO on phosphorylation of the signaling factor ERK1/2 in the ventrolateral OFC of these mice (t(19) = 0.68, p = 0.51; Fig. 3j). Given that ERK1/2 phosphorylation is often considered a marker of synaptic plasticity, the lack of group difference suggests that CNO, at the dose and timing used here, does not obviously impact synaptic plasticity in the OFC of behaviorally-trained mice. Phospho-ERK1/2 was measured in two independent cohorts of mice. In the second cohort, we also measured phosphorylation of cofilin, given that the filament-severing actions of cofilin are controlled by phosphorylation at Ser3. Again, groups did not differ (t(7) = 0.81, p = 0.44; Fig. 3j), suggesting that the CNO dosing and timing used here did not grossly impact cytoskeletal plasticity.
The cytoskeletal regulatory protein FMRP is necessary for efficient expectancy updating
We hypothesized that the proper regulation of dendritic spine plasticity is necessary for OFC-dependent decision-making. To test this possibility, we reduced Fmr1, which encodes FMRP, an endogenous inhibitor of dendritic spine turnover (Pfeiffer and Huber, 2007; Pan et al., 2010; Pfeiffer et al., 2010). At P31 or P56, we infused into the OFC lentiviruses expressing mCherry + shFmr1 or a scrambled control construct (Fig. 4a). Despite weak fluorescence, mCherry accumulation along the infusion track served as histological confirmation that viral vectors targeted the ventrolateral OFC (Fig. 4b). The acquisition of food-reinforced nose-poke responses was unaffected by knockdown (F values < 1; Fig. 4c). When tested for sensitivity to action–outcome contingencies using instrumental contingency degradation, control mice preferentially generated the most highly reinforced behavior, evidence of updated action–outcome expectancies. By contrast, both male and female mice with Fmr1 knockdown beginning early in life (infusion at P31; referred to as young knockdown) failed to respond selectively (interaction: F(3,44) = 3.3, p = 0.03; Fig. 4d). Young knockdown females also responded less overall (Fig. 4d). Somewhat surprisingly, if knockdown was initiated in more mature mice (at P56), Fmr1 loss had no effects.
Figure 4. The cytoskeletal regulatory protein FMRP in the ventrolateral OFC is involved in associating actions with their outcomes. a, Experimental timeline: mice were infused with viral vectors at P31 or P56, and then tested 25 d later. b, mCherry-expressing shFmr1- or scrambled control viral vectors were infused into the ventrolateral OFC. mCherry accumulation along the infusion track is represented on coronal sections from the Mouse Brain Library (Rosen et al., 2000). c, Groups did not differ in response acquisition rates. Acquisition curves represent both responses/min. d, When we provided the food pellet associated with one of the responses non-contingently, control mice preferentially engaged the rewarded response (non-degraded condition). By contrast, Fmr1 knockdown in young (but not adult) mice caused nonselective responding, indicative of a failure in associating actions with their likely outcomes. Additionally, female Fmr1 knockdown mice responded less overall. Bars and symbols indicate mean + SEM. *p < 0.05 versus non-degraded and #p < 0.05 versus all other groups following interaction effects. n.s., Nonsignificant. Total control: n = 18; adult: n = 6; young males: n = 15; young females: n = 8.
We next asked: does Fmr1 knockdown delay or block decision-making based on action–consequence relationships? To address this question, we infused another group of mice with a scrambled construct or shFmr1 at P31 (Fig. 5a). Because of the sex difference identified above, we used only males. Again, Fmr1-deficient mice successfully acquired the instrumental responses (no interaction or main effects of group: F < 1; Fig. 5b), but failed to modify response strategies when one response no longer resulted in food reinforcement (group × response × session: F(2,39) = 3.4, p = 0.04; Fig. 5c). Even after four additional nose-poke training sessions and another session in which one action–outcome contingency was degraded, knockdown mice again failed to differentiate between responses that were more or less likely to be reinforced (Fig. 5c). Finally, after another four training sessions and a final exposure to the action–outcome contingency degradation procedure, Fmr1-deficient mice were ultimately able to preferentially generate a response that was likely to be reinforced (Fig. 5c), indicating that adolescent-onset Fmr1 knockdown delays, but does not fully block, decision-making based on action–outcome contingencies. Notably, Fmr1 deficiency delayed response optimization sufficiently that, by the time Fmr1 knockdown mice used predictive relationships to guide their behavior, control mice had developed stimulus-response habits by virtue of extended training (Fig. 5c,d; Dickinson et al., 1983; Balleine and O'Doherty, 2010).
Figure 5. FMRP in the ventrolateral OFC is necessary for efficiently selecting actions based on their outcomes, and its deficiency causes dendritic spine excess. a, Experimental timeline: mice were infused with viral vectors at P31, then tested at P56. b, As in the prior figure, Fmr1 knockdown did not affect response rates during training, and acquisition curves represent both responses/min. Breaks in the acquisition curve represent tests for sensitivity to action–outcome contingencies. c, Also as in the prior figure, Fmr1 knockdown mice were unable to modify their response strategies when action–outcome contingencies were weakened (degraded condition). Indeed, mice with Fmr1 knockdown failed to engage outcome-oriented response strategies until after quite extensive training (Test 3), indicating that action–outcome conditioning was impaired, though not fully blocked. At this point, mice with a control viral vector had been so extensively trained that they had developed action–outcome-insensitive habits by virtue of protracted experience. d, The same data are represented as a ratio of non-degraded over degraded responses; the dashed line at 1 refers to chance levels of responding (i.e., nonselective responding), whereas ratios marked >1 reflect action–outcome-based response strategies. This comparison highlights delayed action–outcome conditioning with Fmr1 knockdown. n = 7 control, n = 8 knockdown. e, Dendritic spines were enumerated in separate mice bearing a scrambled control viral vector in one hemisphere and an shFmr1-expressing viral vector in the opposite hemisphere. Fmr1-deficient neurons had greater spine densities. f, Meanwhile, dendrite diameters did not differ between groups. Representative dendrites are adjacent. n = 9 mice, each with one hemisphere expressing the control viral vector and the opposite hemisphere expressing the Fmr1 knockdown vector. Bars and symbols indicate mean + SEM. *p < 0.05, **p < 0.001. n.s., Nonsignificant. Scale bars, 2 μm.
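The preference ratio described for panel d can be sketched as follows. This is a minimal illustration, assuming the ratio is computed per mouse from response rates in responses/min; the function name `preference_ratio` and the example rates are hypothetical and are not values from this study.

```python
def preference_ratio(non_degraded_rate, degraded_rate):
    """Ratio of non-degraded to degraded responding (both in responses/min).

    A ratio of 1 corresponds to nonselective responding (the dashed line);
    ratios above 1 indicate a preference for the non-degraded response.
    """
    if degraded_rate <= 0:
        raise ValueError("degraded response rate must be positive")
    return non_degraded_rate / degraded_rate

# Hypothetical response rates, for illustration only.
selective = preference_ratio(non_degraded_rate=6.0, degraded_rate=3.0)
nonselective = preference_ratio(non_degraded_rate=4.0, degraded_rate=4.0)

print(f"selective mouse:    ratio = {selective:.1f}")
print(f"nonselective mouse: ratio = {nonselective:.1f}")
```

Expressing the data as a ratio places groups with different overall response rates on a common scale, which is why the comparison against 1 is informative.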
Even though Fmr1 is widely regarded as a key regulatory factor in dendritic spine plasticity, and previous behavioral and biochemical investigations have used viral vectors to selectively silence Fmr1 in the prefrontal cortex (discussed in detail below), the structural effects of postnatal-onset Fmr1 silencing are rarely investigated. Instead, the majority of investigations of dendritic spine structure and turnover use constitutive Fmr1−/− mice (in which case, Fmr1 is absent upon conception). Thus, we infused shFmr1-expressing or control scrambled viral vectors into the ventrolateral OFC of separate, behaviorally-naive YFP-expressing mice at P31 and quantified dendritic spine density on excitatory layer V OFC neurons at P56. Fmr1 knockdown increased dendritic spine density (t(8) = −2.469, p = 0.04; Fig. 5e), confirming the expected dendritic spine excess in Fmr1-deficient neurons. All spine types were affected; thus, total densities are shown.
We also asked: is postnatal-onset Fmr1 knockdown “damaging” neurons? One marker of structural damage is dendritic blebbing, causing the dendrite to enlarge. 3-D dendrite reconstruction revealed that Fmr1 knockdown did not alter dendrite diameter (t(8) = −0.37, p = 0.72; Fig. 5f), suggesting that neurons were not damaged.
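As a consistency check, the two-tailed p values for the comparisons in Figure 5e and 5f can be recovered from the reported t statistics and degrees of freedom. The sketch below is only an illustration under stated assumptions: it assumes two-tailed tests, uses only the Python standard library, and integrates the Student t density numerically with Simpson's rule. The function names `t_pdf` and `two_tailed_p` are introduced here and are not part of the original analysis.

```python
import math

def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p value, using Simpson's rule on the density over [0, |t|].

    steps must be even for Simpson's rule.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central

# t statistics and degrees of freedom as reported in the text.
p_density = two_tailed_p(-2.469, 8)   # spine density, Fig. 5e
p_diameter = two_tailed_p(-0.37, 8)   # dendrite diameter, Fig. 5f

print(f"spine density:     t(8) = -2.469, p = {p_density:.2f}")
print(f"dendrite diameter: t(8) = -0.37,  p = {p_diameter:.2f}")
```

Rounded to two decimals, both values agree with the p values reported in the text (0.04 and 0.72).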
Discussion
Choosing behaviors that are likely to be rewarded with desired outcomes is an essential aspect of day-to-day function that requires the OFC, a cortical structure considered relatively conserved across rodent-primate species (Wallis, 2011; Carlén, 2017; Izquierdo, 2017). Here we trained mice to generate two nose-poke responses (left side of chamber vs right) for food reinforcers, then we decreased the likelihood that one of the behaviors would be reinforced, requiring mice to update any action–outcome expectations that had formed. We discovered that mice that engaged in action–outcome decision-making had fewer immature, thin-type dendritic spines on excitatory pyramidal neurons within the ventrolateral OFC and a higher proportion of mature, mushroom-shaped spines. Given that the OFC is involved in prospective calculations of likely outcomes, even when they are not observable (Wilson et al., 2014; Stalnaker et al., 2015), “turning down” a neuron's propensity to form new spines may be important for maintaining durable expectations. To test this perspective, we reduced levels of the RNA-binding protein FMRP, which inhibits dendritic spine turnover [hence dendritic spine excess in FMRP-deficient mice and Fragile X syndrome (FXS), in which FMRP is lost; He and Portera-Cailliau, 2013]. FMRP deficiency caused dendritic spine overabundance and impeded action–outcome conditioning. In sum, we provide empirical evidence that flexible reward-related expectation requires dendritic spine plasticity in the OFC.
Dendritic spine subtypes and OFC subregions
Dendritic spine morphology reflects stages of spine formation, stabilization, and collapse (Bourne and Harris, 2007). Thin-type dendritic spines are considered transient extensions that can develop into more mature spines with a characteristic mushroom shape, or instead, be eliminated (Nimchinsky et al., 2002). Certain drugs that enhance action–outcome memory can trigger spine head enlargement in the OFC (Sharp et al., 2017). Meanwhile, OFC-targeted infusions of cytoskeletal-destabilizing compounds and genetic manipulations that cause PSD95 loss occlude such memory (Swanson et al., 2015; DePoy et al., 2017). To determine whether dendritic spine plasticity is associated with action–outcome expectancy under more naturalistic circumstances, we trained drug-naive mice to respond for two food reinforcers, then used an instrumental contingency degradation procedure that weakens one action–outcome association. Meanwhile, another group of mice was exposed to the same food restriction and handling procedures, but food was delivered non-contingently throughout all training and testing. Thin-type spines were sparser with action–outcome conditioning, resulting in greater overall proportions of mushroom-shaped spines. Constraining thin-type spines would limit the ability of new synapses to form, potentially allowing existing synaptic contacts to help maintain enduring expectations under uncertain circumstances.
Next, we inactivated the ventrolateral OFC immediately following the violation of familiar action–outcome contingencies. The OFC is thought to update and solidify action–outcome expectancies during this time, allowing for efficient responding when the mouse next encounters the testing chamber (Zimmermann et al., 2017, 2018). OFC inactivation caused mice to defer to familiar, habit-like behavior, as expected. Notably, effects were more rapid than in our prior reports (Zimmermann et al., 2017, 2018), potentially because of larger viral vector spread. Further, OFC inactivation blocked thin-type spine elimination, indicating that dendritic spine plasticity is associated with modifying learned associations, rather than the initial nose-poke training.
Across rodent and primate species, inactivation of the ventrolateral OFC interferes with organisms' ability to select actions according to their outcomes (see Introduction), including in marmosets performing a task very similar to that used here (Jackson et al., 2016). To corroborate prior reports, we also generated mice with cell-body lesions of the OFC, causing failures in action–consequence decision-making. We speculate that ventrolateral OFC inactivation disrupts interactions with the downstream ventrolateral and dorsomedial striatum (Schilman et al., 2008), striatal subregions necessary for action–outcome decision-making (Yin et al., 2008; Gourley et al., 2013a). In their absence, the dorsolateral striatum may instead control reward-seeking behaviors, resulting in automated, habit-based response strategies that are by definition insensitive to action–outcome contingencies (Yin et al., 2008; Balleine and O'Doherty, 2010).
Linking cell structure with function
More than two decades of investigation have linked modifications in neural structure with learned behaviors (see Introduction), but confirming causal relationships has historically been difficult because of limited means for manipulating structural plasticity in vivo. Here we site-selectively silenced Fmr1, given that a primary function of FMRP, encoded by Fmr1, is to inhibit dendritic spine turnover and sustain homeostatic dendritic spine and synapse stability (Pfeiffer and Huber, 2007; Cruz-Martín et al., 2010; Pan et al., 2010; Pfeiffer et al., 2010; He and Portera-Cailliau, 2013; Lauterborn et al., 2015). In the Fmr1−/− OFC, synaptic markers PSD95 and SAPAP3 and plasticity-regulated genes Arc/Arg3.1 and cFos are decreased (Krueger et al., 2011). Long-term potentiation (LTP), including experience-dependent LTP, is also impaired following FMRP loss (Larson et al., 2005; Zhao et al., 2005; Lauterborn et al., 2007; Hu et al., 2008; Chen et al., 2010; Cruz-Martín et al., 2010; Pan et al., 2010; Padmashri et al., 2013). As hypothesized, local knockdown here impeded the ability of mice to modify familiar actions based on reward likelihood.
In a separate experiment, we repeatedly exposed Fmr1-deficient mice to instrumental contingency degradation to determine whether, with repeated opportunity for new learning, these mice would ultimately be able to modify their response strategies based on action–outcome associations. Indeed, with experience, Fmr1-deficient mice ultimately inhibited indiscriminate responding and preferentially selected actions based on outcome likelihood. By this time, however, control mice had developed habits because of extended training. Fmr1-deficient mice could ultimately “learn” the contingency degradation procedure, presumably because other brain structures involved in action–outcome decision-making (for review, see Hart et al., 2014) were unaffected. Nevertheless, OFC-selective Fmr1 deficiency clearly delayed goal-oriented action selection. Such a deficiency could contribute to disability in FXS, in which some individuals display insistence on sameness and change can be distressing.
Importantly, lentiviral-mediated gene silencing allowed us to manipulate primarily excitatory neurons (with modest glial and virtually no interneuron infection anticipated; Ehrengruber et al., 2001) selectively in the OFC. This approach is thus highly targeted, but one potential drawback is that the structural effects of postnatal-onset Fmr1 silencing have largely not been characterized, given the overwhelming utilization of Fmr1 knock-out mice in the field. Thus, we confirmed that postnatal viral-mediated Fmr1 silencing caused dendritic spine excess on OFC neurons, ours being the only investigation, to the best of our knowledge, to do so.
How, at an intracellular level, might FMRP be acting? One clue may relate to our finding that early-life (P31) knockdown caused behavioral abnormalities, while knockdown later in life did not. P31 is considered early adolescence in rodents (Spear, 2000), a period of maturation of cortical neurocircuits in both rodents (Green and McCormick, 2013; Caballero et al., 2016) and primates (Bourgeois et al., 1994; Huttenlocher and Dabholkar, 1997; Jacobs et al., 1997; Selemon, 2013). Pfeiffer and Huber (2007) suggest that FMRP's role as a translational inhibitor is important for effects on dendritic spines. For instance, they revealed that the transcription factor MEF2 is involved in FMRP-mediated synapse elimination (Pfeiffer et al., 2010). Relatively little is known regarding the transcripts regulated by MEF2 and FMRP that mediate these events. Dendritic spine production in early adolescence and elimination and stabilization in late adolescence involve Arg/Abl2 kinase (Gourley et al., 2009, 2012), which activates the substrate cortactin (Lapetina et al., 2009), abundant in the adolescent OFC (Shapiro et al., 2017). FMRP determines the synaptic localization of cortactin (Seese et al., 2012), so its deficiency could conceivably interfere with Arg activity during adolescence, which would interfere with OFC function (Gourley et al., 2009, 2012). One implication of this possibility is that OFC-specific Fmr1 knockdown later in life would have few, or no, consequences, and our behavioral experiments support this perspective, given that adult-onset knockdown had no effects.
The behavioral experiments we report here are generally consistent with evidence that Fmr1 deficiency throughout the prefrontal cortex or nonselective Fmr1 knock-out causes prefrontal cortical-dependent decision-making defects (Krueger et al., 2011; Gross et al., 2015, 2019; Siegel et al., 2017), yet certain distinctions are important. First, adult-onset knockdown of Fmr1 throughout the medial and lateral prefrontal cortex spares response flexibility in the presently-used task, but causes decision-making defects in other circumstances also requiring strategy modification (contingency reversal and extinction; Gross et al., 2015, 2019). Thus, FMRP in multiple prefrontal cortical structures, even in adulthood, appears necessary for inhibiting inappropriate or inefficient behaviors. Second, Fmr1-deficient males here generated higher response rates relative to females. Some evidence suggests that boys with FXS are more likely to be hyperactive (Rinehart et al., 2011); thus, hyperactivity in male shFmr1 mice could conceivably account for elevated response rates, but further research would be needed to empirically test this possibility.
Conclusions
Unifying frameworks for OFC function conceptualize it as linking behaviors and stimuli with anticipated outcomes, even when associations are not immediately observable (Wilson et al., 2014). Our findings suggest that updating action–outcome associations triggers the elimination of thin-type spines (generally considered immature) in the OFC, leaving a substantial proportion of spines with a mature, mushroom-shaped head. This plasticity may be important for maintaining durable expectations under uncertain circumstances. A key next step will be better understanding learning-related dendritic spine plasticity throughout corticostriatal circuits necessary for efficient goal seeking.
Footnotes
This work was supported by Children's Healthcare of Atlanta, the Emory Egleston Children's Research Center, NIH MH101477, MH117103, MH103748, GM008602, and GM000680. The Yerkes National Primate Research Center is supported by the Office of Research Infrastructure Programs/OD P51OD011132. The Emory Viral Vector Core is supported by an NINDS Core Facilities Grant, P30NS055077. We thank Nisha Raj, John Yamin, Hayley Arrowood, Aylet Allen, and Dr. Kelsey Zimmermann for important contributions.
The authors declare no competing financial interests.
Correspondence should be addressed to Shannon L. Gourley at shannon.l.gourley@emory.edu