Abstract
The medial orbitofrontal cortex (mOFC) regulates a variety of cognitive functions, including refining action selection involving reward uncertainty. This region sends projections to numerous subcortical targets, including the ventral and dorsal striatum, yet how these corticostriatal circuits differentially regulate risk/reward decision-making is unknown. The present study examined the contribution of mOFC circuits linking the nucleus accumbens (NAc) and dorsomedial striatum (DMS) to risk/reward decision-making using pharmacological disconnections. Male rats were well trained on a probabilistic discounting task involving choice between small/certain or large/risky rewards, with the probability of obtaining the larger reward decreasing or increasing over a session. Disconnection of mOFC-striatal pathways was achieved using infusions of GABA agonists inactivating the mOFC in one hemisphere, combined with NAc or DMS inactivation in the contralateral or ipsilateral hemisphere. Perturbing mOFC → NAc circuits induced suboptimal, near-random patterns of choice that manifested as a flattening of the discounting curve. Animals were equally likely to stay or shift following rewarded/nonrewarded choices, suggesting this pathway mediates use of information about reward history to stabilize decision biases. In contrast, mOFC → DMS disconnection impaired adjustments in decision biases, causing opposing changes in risky choice depending on how probabilities varied over time. This was driven by alterations in lose-shift behavior, suggesting mOFC → DMS circuits track volatility in nonrewarded actions to adjust choice in accordance with changes in profitability. Thus, separate mOFC-striatal projection pathways regulate dissociable processes underlying decision-making, with mOFC → NAc circuits aiding in establishing and stabilizing task states and mOFC → DMS circuits facilitating transitions across states to promote flexible reward seeking.
SIGNIFICANCE STATEMENT The medial orbitofrontal cortex regulates a variety of goal-directed behaviors, yet the functional circuits through which it mediates higher order decision-making functions are unclear. The present study revealed that different mOFC projection pathways facilitate diverse aspects of decision-making involving risks and rewards by engaging separate networks of neurons that interface with distinct ventral and dorsal striatal targets. These findings clarify some of the normal functions of these corticostriatal pathways and may have implications for understanding how dysfunction in these circuits relates to certain psychiatric disorders.
Introduction
Optimal decision-making entailing integration of information about costs and rewards associated with different actions is mediated in part by the orbitofrontal cortex (OFC). The OFC can be partitioned into lateral versus medial subregions that are anatomically and functionally heterogeneous across species (Kringelbach and Rolls, 2004; Price, 2007; Hoover and Vertes, 2011; Zald et al., 2014; Heilbronner et al., 2016). Since the initial description by Bechara et al. (1994, 1996) that damage to the ventromedial prefrontal cortex (including the medial OFC; mOFC) disrupted decision-making on the Iowa Gambling Task, studies examining the mOFC in humans have implicated this region in reward-related decision-making (Hornak et al., 2004; Tsuchida et al., 2010; Peters and D'Esposito, 2016; Noonan et al., 2017), potentially by encoding value representations (O'Doherty et al., 2001; Diekhof et al., 2012; Metereau and Dreher, 2015).
Preclinical studies in rodents have provided additional insight into mOFC function, implicating this region in goal-directed and flexible behavior, that is, actions that are sensitive to changes in outcome value. More specifically, the mOFC is necessary when optimal choice is dependent on previously learned, but not currently observable, action-outcome information (Noonan et al., 2012; Bradfield et al., 2015; Dalton et al., 2016). mOFC lesions/inactivations alter numerous aspects of reward-related behavior, including, but not limited to, retrieving representations of value (Bradfield et al., 2018; Malvaez et al., 2019), behavioral flexibility (Gourley et al., 2016; Hervig et al., 2020), and effort-related responding (Münster and Hauber, 2018). Of particular note, the mOFC guides action selection in situations involving reward uncertainty (Dalton et al., 2016; Hall-McMaster et al., 2017; Stolyarova and Izquierdo, 2017; Verharen et al., 2020). For example, different manipulations of the rat mOFC alter choice on a probabilistic discounting task, entailing choice between smaller certain rewards and larger ones delivered with varying probabilities. Optimal risk/reward decision-making requires integrating information about different reward magnitudes with choice-outcome history to estimate the relative utility associated with different options. mOFC inactivation uniformly increased risky choice and enhanced win-stay behavior, suggesting that the collective activity of neurons in this region may temper the urge to pursue high-risk/high-reward options (Stopper et al., 2014), similar to clinical findings in humans (Clark et al., 2008). On the other hand, blockade of mOFC dopamine D1 versus D2 receptors actually induces opposing changes in probabilistic discounting, reducing or increasing risky choice, respectively (Jenni et al., 2021). These findings raise the possibility that different populations of mOFC neurons regulate dissociable aspects of reward seeking, which may be distinguished by the functional connections they form with different subcortical regions.
The striatum is one key system the mOFC may interact with to guide decision-making. Unidirectional projections originating in the mOFC form ipsilateral and contralateral connections with the entire dorsal/ventral axis of the medial striatum (Hoover and Vertes, 2011; Hintiryan et al., 2016). Some mOFC cells target the dorsal or ventral striatum separately, but a subgroup of cells send collateral projections to both regions (Reynolds and Zahm, 2005), providing multiple circuits through which mOFC-dependent representations may influence action selection. The nucleus accumbens (NAc) integrates value-related signals processed by cortical and amygdalar networks to bias action selection for motivated behaviors (Mogenson et al., 1980; Britt et al., 2012; Floresco, 2015). The NAc mediates various forms of cost/benefit decision-making, with lesions/inactivations generally reducing preference for larger rewards associated with various costs (Cousins et al., 1996; Cardinal et al., 2001; Salamone et al., 2007; Ghods-Sharifi and Floresco, 2010), including reward uncertainty (Stopper and Floresco, 2011; St. Onge et al., 2012a; Floresco et al., 2018). Changes in mesoaccumbens dopamine efflux track risky choice (St. Onge et al., 2012b), and activity of NAc neurons is predictive of choices made (Saddoris et al., 2015; Zalocusky et al., 2016), suggesting it biases actions toward subjectively better options (Sugam et al., 2014). In comparison, it is well documented that the dorsomedial striatum (DMS) promotes goal-directed action (Yin et al., 2005, 2008) and cognitive flexibility (Ragozzino et al., 2002; Ragozzino, 2007; Castañé et al., 2010; Grospe et al., 2018). Interestingly, there are few studies examining the contribution of the DMS to cost/benefit decision-making, yet we have recently identified a role for this striatal compartment in facilitating flexible shifts in choice biases during probabilistic discounting (Schumacher et al., 2021).
Despite the rich connectivity between the mOFC and different compartments of the striatum, few studies have examined whether these regions interact to guide reward seeking (Wang et al., 2019), and none have directly compared the specific functional role of these different corticostriatal circuits. Given these considerations, the present study used pharmacological disconnections to elucidate how serial communication between the mOFC and the NAc or DMS may differentially influence reward seeking during risk/reward decision-making.
Materials and Methods
Subjects
Adult male Long-Evans rats (Charles River Laboratories) weighing 225–275 g at the start of the experiment were group housed and provided access to water and food ad libitum on arrival. Female rats were not included in this study. Some studies have found that females tend to be more risk averse on probabilistic discounting tasks compared with males (Orsini and Setlow, 2017; Islas-Preciado et al., 2020). However, despite potential baseline differences, manipulations such as amphetamine treatment induce similar changes in behavior in both sexes. Rats were handled daily for 1 week, and then food was restricted to 85–90% of their free-feeding weight. They were then fed 15–17 g of food at the end of each experimental day. Their weights were monitored daily, and their individual food intake was adjusted to maintain a steady but modest weight gain. The colony was maintained on a 12 h light/dark cycle, with lights on at 7:00 A.M. The rats underwent behavioral testing between 8:00 A.M. and 12:00 P.M. each day. All experiments were conducted in accordance with the Canadian Council on Animal Care guidelines regarding appropriate and ethical treatment of animals and were approved by the Animal Care Center at the University of British Columbia.
Apparatus
Behavioral testing was conducted in operant chambers (31 × 24 × 21 cm, Med Associates) enclosed in sound-attenuating boxes. The chambers were equipped with a fan that provided ventilation and masked extraneous noise. A single 100 mA house light illuminated the chambers, and each chamber was fitted with two retractable levers located on either side of a central food receptacle in which 45 mg sweetened food reward pellets (Bio-Serv) were delivered by a dispenser. Four infrared photobeams were mounted on the sides of each chamber, and another photobeam was located in the food receptacle. Locomotor activity was indexed by the number of photobeam breaks that occurred during a session. All data were recorded by a computer connected to the chambers via an interface.
Lever pressing training
The initial training protocols described below were identical to those described in our previous studies (St. Onge et al., 2012a; Jenni et al., 2017). The day before exposure to the operant chamber, each rat was given ∼25 sugar reward pellets in their home cage to familiarize them with the reward. On the first day of training, two pellets were delivered into the food cup, and crushed pellets were sprinkled on an extended lever before the rat was placed into the chamber. On consecutive days, rats were trained under a fixed-ratio 1 schedule to a criterion of 50 presses in 30 min, first on one lever and then the other (counterbalanced). They then progressed to a simplified version of the full task. These 90-trial sessions began with the levers retracted and the operant chamber in darkness. Every 40 s, a trial was initiated with the illumination of the house light and the insertion of one of the two levers into the chamber (randomized in pairs). Failure to respond on the lever within 10 s caused the lever to retract and the chamber to darken, and the trial was scored as an omission. A response within 10 s caused the lever to retract and the delivery of a single pellet with 50% probability. Rats were trained for ∼3–4 d to a criterion of 80 or more successful trials (<10 omissions). Immediately after the final session of retractable lever training, rats were tested for their side bias toward a particular lever, using a procedure described previously (Larkin et al., 2016). Previous studies from our laboratory showed that we could considerably reduce the number of training days by accounting for the innate side bias of the rat when designating the risky lever, compared with when we randomly assigned the location of the risky lever across subjects. On the following day, rats started training on the probabilistic discounting task, with the risky lever assigned to be the one opposite the animal's side bias.
Probabilistic discounting task
Each daily session consisted of 90 trials separated into five blocks of 18 trials and took 50 min to complete. Rats were trained 5–7 d/week. One lever was designated the large/risky lever and the other the small/certain lever, and this designation remained consistent throughout training. Each session began in darkness with both levers retracted. Trials began every 33 s with the illumination of the house light, then 2 s later, one or both levers were inserted into the chamber. Each of the five blocks consisted of eight forced-choice trials (where only one lever was presented, randomized in pairs, so there were four forced-choices on each lever), followed by 10 free-choice trials (Fig. 1A). If no response was made within 10 s of lever presentation, the levers retracted, and the chamber reverted to the intertrial state (omission). Selection of a lever caused its immediate retraction. A choice of the small/certain lever delivered one pellet with 100% probability. Choice of the large/risky lever delivered a four-pellet reward in a probabilistic manner that changed systematically across blocks of trials (100, 50, 25, 12.5, 6.25%; Fig. 1A). The actual probability of receiving the large reward was drawn from a set probability distribution, so that on any given day rats may not have experienced the exact probability assigned to that block; however, the actual probabilities averaged across an extended number of training sessions more closely approximated the set value. This includes the 12.5% and 6.25% blocks where, although rats were unlikely to get rewarded during the four risky forced-choice trials within those blocks, with extended training they learned that the large reward was even less likely to be received in the 6.25% versus the 12.5% probability block.
We tested the effects of our manipulations in separate groups of rats trained on a variant of the task where the probability of obtaining the large reward was initially 100% at the start of the session and then decreased over blocks (100 → 6.25%, descending variant), whereas other groups were trained on a variant where the odds started poor and increased over blocks (6.25 → 100%; ascending variant). This was done to delineate whether an increase in risky choice observed in the descending condition reflected either a general increase in preference for larger uncertain rewards (as has been observed after mOFC inactivation; Stopper et al., 2014) or an impairment in adjusting decision biases, as has been reported following inactivation of the prelimbic prefrontal cortex (PFC) and DMS (St. Onge and Floresco, 2010; Schumacher et al., 2021). In the latter case, one would expect to see reduced preference for the risky options in the ascending condition as blocks progressed.
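To make this trial structure concrete, the following Python sketch lays out a pseudo-simulation of a single session. It is illustrative only and is not the Med Associates task-control program; the names run_session, choose, DESCENDING, and ASCENDING are hypothetical.

```python
import random

# Reward probabilities for the large/risky lever across the five blocks.
DESCENDING = [1.0, 0.5, 0.25, 0.125, 0.0625]   # 100 -> 6.25% variant
ASCENDING = list(reversed(DESCENDING))          # 6.25 -> 100% variant

def run_session(probabilities, choose):
    """Pseudo-simulate one 90-trial session (5 blocks x 18 trials).

    `choose(block)` stands in for the rat's decision policy on free-choice
    trials and should return 'risky' or 'certain'.
    """
    outcomes = []
    for block, p_large in enumerate(probabilities):
        # Eight forced-choice trials, randomized in pairs (four per lever),
        # followed by ten free-choice trials.
        forced = []
        for _ in range(4):
            pair = ['risky', 'certain']
            random.shuffle(pair)
            forced.extend(pair)
        trials = [('forced', lever) for lever in forced] + [('free', None)] * 10
        for trial_type, forced_lever in trials:
            lever = forced_lever if trial_type == 'forced' else choose(block)
            if lever == 'risky':
                # Large/risky lever: four pellets delivered probabilistically.
                pellets = 4 if random.random() < p_large else 0
            else:
                # Small/certain lever always delivers one pellet.
                pellets = 1
            outcomes.append((block, trial_type, lever, pellets))
    return outcomes

# Example: a purely random chooser tested on the descending variant.
session = run_session(DESCENDING, lambda block: random.choice(['risky', 'certain']))
```

Replacing the random policy passed to run_session with a rule based on recent reward history would illustrate how a discounting curve emerges from trial-by-trial choices.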
Squads of rats were trained until they displayed stable choice behavior, determined by analyzing data from three consecutive sessions with a two-way repeated-measures ANOVA with day and trial block as factors. If there was no main effect of day and no day × block interaction (at p > 0.10), choice behavior of the group was deemed stable. Note that this larger-than-conventional alpha level was used to provide a more conservative test of the null effect that defines stable choice behavior. Two to 3 d later, rats were subjected to surgery. Rats in the different groups required 16–23 d of training before displaying stable patterns of choice.
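For illustration, this stability criterion could be checked along the following lines, assuming choice data have been aggregated into a long-format table with one row per rat, session (day), and block. The column names and the use of statsmodels are assumptions; the original analyses were run in conventional statistics software, not with this code.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

def choice_is_stable(df, alpha=0.10):
    """Return True if group choice is stable across three consecutive sessions.

    df: long-format DataFrame with columns 'rat', 'day', 'block', and
    'pct_risky', one row per rat x day x block. Stability is defined here as
    no main effect of day and no day x block interaction at p > 0.10.
    """
    res = AnovaRM(data=df, depvar='pct_risky', subject='rat',
                  within=['day', 'block']).fit()
    table = res.anova_table
    p_day = table.loc['day', 'Pr > F']
    p_interaction = table.loc['day:block', 'Pr > F']
    return (p_day > alpha) and (p_interaction > alpha)
```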
Reward magnitude discrimination
As we have done previously, we determined a priori that if a particular manipulation reduced risky choice on both variants of the discounting task, we would test the effect of that same manipulation on a separate group of animals trained on a reward magnitude discrimination task. This was to determine whether this effect was because of an impairment in discriminating between reward magnitudes associated with the two levers, or any other nonspecific deficits in motor coordination or motivation for sucrose. This task consisted of 48 trials partitioned into four blocks of two forced-choice and 10 free-choice trials (12 trials per block). The probabilistic nature of the task was removed so that a choice on the large reward lever delivered four pellets, and a choice on the other lever delivered one pellet, both with 100% probability. After 8–10 d of training, once rats displayed a strong bias toward the larger reward, they received microinfusion test days in the same fashion as the animals tested on the probabilistic discounting task. Because this task requires considerably fewer training sessions to achieve stable choice behavior compared with the discounting task, these rats were implanted with guide cannulae before behavioral training.
Pharmacological disconnection design and rationale
Classical disconnection designs have traditionally used asymmetrical unilateral lesions or inactivations of interconnected brain regions to identify components of a functional neural circuit (Everitt et al., 1991; Gaffan et al., 1993; Floresco et al., 1997; Floresco and Ghods-Sharifi, 2007; Bossert et al., 2012; Reiner et al., 2020). These designs are based on the assumption that information is transferred serially from one region to an output structure downstream and that these signals are transmitted in both sides of the brain in parallel. They further assume that dysfunction results from blockade of neural activity at the origin of a pathway in one hemisphere and the termination of the efferent pathway in the contralateral hemisphere. Specifically, we administered unilateral infusions of a GABAA/B agonist cocktail into the mOFC, in combination with infusions into the contralateral NAc or DMS, in separate groups of rats.
An additional assumption of disconnection designs is that ipsilateral inactivation of one or both structures in a circuit may not have a disruptive effect on behavior because the intact structures in the opposite hemisphere should be able to at least partially compensate for the unilateral disruption in function. We controlled for this in two ways. First, each experiment included an ipsilateral disconnection condition, where unilateral intra-mOFC and NAc or DMS inactivation were administered in the same hemisphere. Furthermore, all rats in these studies were retrained after their initial set of test days and then received unilateral infusions of saline, as well as unilateral inactivation of the mOFC, NAc, or DMS.
Surgery
Rats were provided food ad libitum for 2 d before surgery. Rats were initially given a sedative dose of ketamine and xylazine (50 and 4 mg/kg respectively), which served as an analgesic measure, and were subsequently maintained on isoflurane to achieve a surgical plane of anesthesia for the duration of the procedure. Rats were stereotaxically implanted with two sets of bilateral 23 gauge stainless steel guide cannulae. Two different combinations of implantations were used: (1) mOFC [anteroposterior (AP), +4.5 mm; mediolateral (ML), ±0.7 mm from bregma; dorsoventral (DV), −3.3 mm from dura] and NAc (AP, +1.6 mm; ML, ±1.3 mm; DV, −6.3 mm) or (2) mOFC (same coordinates as above) and DMS (AP, +1.2 mm; ML, ±1.5 mm; DV, −4.0 mm). The coordinates for the NAc were centered near the border of the shell/core subregions because the mOFC sends moderately dense projections to this region (Hoover and Vertes, 2011), and inactivation of this region of the NAc induces a larger effect on probabilistic discounting compared with inactivation of either subregion alone (Stopper and Floresco, 2011). Likewise, fibers from the mOFC were shown to distribute heavily to medial aspects of the caudate/putamen, including the DMS region targeted here (Hoover and Vertes, 2011). Cannulae were held in place with stainless steel screws and dental acrylic. Thirty gauge obturators were inserted into the guide cannulae and remained in place until infusions were performed. The animals were given a minimum of 1 week to recover from surgery before being retrained on the probabilistic discounting task for a minimum of seven training sessions and until a group reestablished stable patterns of choice behavior.
Drugs, microinfusion procedure, and experimental design
Once stable choice behavior was reestablished, animals received a mock infusion to familiarize them with the procedures. Obturators were removed, and injectors were placed inside the guide cannulae for 2 min, but no infusion was administered.
One or 2 d following the mock infusion, animals received their first microinfusion test day. Drugs or saline were infused at a volume of 0.3 μl. Inactivations were induced using a solution containing the GABAB agonist baclofen (75 ng; Sigma-Aldrich) and the GABAA agonist muscimol (75 ng; Tocris Bioscience). Infusions were administered via 30 gauge injection cannulae that protruded 0.8 mm past the end of the guide cannulae, at a rate of 0.4 μl/min by a microsyringe pump, so that the infusion lasted 45 s. The injection cannulae remained in place for an additional 60 s to allow for diffusion.
Separate groups of rats received three counterbalanced infusions 10 min before behavioral testing on separate days (a within-subjects design). The order of treatment and hemispheres that received the ipsilateral/contralateral infusions were counterbalanced across animals. The three primary treatment conditions were (1) control condition, asymmetrical unilateral saline infusions into the mOFC and NAc/DMS; (2) ipsilateral condition, infusions of GABA agonists into one hemisphere of the mOFC and the NAc/DMS in the same hemisphere; and (3) disconnection condition, infusion of GABA agonists into the mOFC and contralateral NAc or DMS. After the first infusion test day, animals were retrained for 1–3 d until their choice behavior deviated by <10% from their preinfusion baseline, after which they received their second counterbalanced sequence of infusions, and this continued until each rat received the three primary treatments.
After the initial series of infusion tests were complete, rats from each group were retrained for 3–5 d before receiving another sequence of counterbalanced single unilateral infusions. These included unilateral infusions of GABA agonists into mOFC, NAc, or DMS, as well as unilateral saline infusions in these regions as a control. Given the asymmetrical nature of the infusions administered, no rat received >3 infusions in a particular brain site.
The reward magnitude discrimination experiment was conducted in a separate group of rats that underwent microinfusion procedures similar to those described above. They received inactivation of the mOFC in combination with inactivation of the contralateral NAc, or saline infusions, on separate days. Because of the opposing effects of mOFC-DMS disconnection on risky choice, it was unlikely that the effects of this manipulation were driven by deficits in reward magnitude discrimination, so the effects of disconnecting this circuit on this control task were not assessed.
Histology
After completion of all test days, rats were killed with isoflurane anesthesia followed by exposure to carbon dioxide. Brains were fixed in a 4% formalin solution. Each brain was frozen and sliced into 50 μm sections, mounted, and stained with Cresyl Violet. Placements were located with reference to the neuroanatomical atlas of Paxinos and Watson (2005). Data from rats whose placements resided outside the borders of the mOFC, the NAc, or the DMS were removed from the analysis. The locations of all acceptable infusion sites are shown in Figure 1B.
Data analysis
The primary dependent variable of interest was the proportion of choices of the large reward option, factoring out trial omissions. This was calculated in each block by dividing the number of choices of the large reward lever by the total number of trials in which the rats made a choice. Choice data were analyzed using a three-way between/within-subjects ANOVA, with treatment and probability block as two within-subject factors and task variant (ascending or descending odds) as a between-subjects factor. In these analyses, a three-way interaction indicates that the effect of treatment on discounting differed between the descending and ascending task variants, so follow-up analyses compared these groups separately. In comparison, a lack of an interaction with the task variant factor implies that a manipulation induced similar effects regardless of the manner in which reward probabilities changed over a session. All follow-up multiple comparisons were made using an a priori Dunnett's test where appropriate. In these analyses, the main effect of trial block was always significant (p < 0.001) and is not discussed further.
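As a minimal sketch of this calculation (hypothetical column names; not the authors' analysis code), the per-block proportion of risky choices could be computed as follows:

```python
import pandas as pd

def risky_choice_proportion(free_choice_trials):
    """Per-block proportion of large/risky choices, factoring out omissions.

    free_choice_trials: DataFrame with columns 'block' and 'choice', where
    'choice' is 'risky', 'certain', or 'omission'. Omitted trials are dropped
    so the denominator counts only trials on which a choice was made.
    """
    responded = free_choice_trials[free_choice_trials['choice'] != 'omission']
    return responded.groupby('block')['choice'].apply(lambda c: (c == 'risky').mean())
```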
If a treatment induced a significant alteration in choice behavior on the probabilistic discounting task, supplementary analyses were conducted to clarify whether these effects were attributable to changes in reward sensitivity (win-stay behavior) and/or negative-feedback sensitivity (lose-shift behavior). Each choice was analyzed according to the outcome of the preceding trial and expressed as a ratio. The win-stay score was calculated as a proportion of risky choices made following a receipt of the larger reward (a risky win) divided by the total number of larger rewards (wins) obtained. Lose-shift scores were calculated as the proportion of trials rats shifted to make a small/certain choice following a nonrewarded risky choice (risky loss) over the total number of nonrewarded choice trials (losses). These scores were analyzed together using a two-way ANOVA, with outcome type (win-stay or lose-shift) and treatment as two within-subject factors. Changes in win-stay/lose-shift behavior indexed changes in reward and negative feedback sensitivity, respectively. Note that this approach only examined how the most recent outcome of a choice affected subsequent choice, and thus only serves as a partial indicator of how our manipulations affected integration of the broader action/outcome reward history of the session to shape reward/negative feedback sensitivity. In addition, response latencies, locomotion, and the number of trial omissions were analyzed with one- or two-way repeated-measures ANOVAs as appropriate. For all tests comparing the behavioral effects of disconnection versus control treatments, alpha was set at p < 0.05.
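The win-stay and lose-shift ratios defined above can be expressed compactly as a function over consecutive free-choice trials. This sketch is illustrative: the data format is hypothetical, and the restriction to consecutive free-choice trials is an assumption about how the trial sequence was scored.

```python
def win_stay_lose_shift(trials):
    """Compute win-stay and lose-shift ratios from consecutive free-choice trials.

    trials: list of (choice, pellets) tuples in trial order, where choice is
    'risky' or 'certain' and pellets is the reward earned on that trial.
    A win is a rewarded risky choice; a loss is a nonrewarded risky choice.
    """
    wins = losses = stays_after_win = shifts_after_loss = 0
    for (choice, pellets), (next_choice, _) in zip(trials, trials[1:]):
        if choice != 'risky':
            continue
        if pellets > 0:                               # risky win
            wins += 1
            stays_after_win += (next_choice == 'risky')
        else:                                         # risky loss
            losses += 1
            shifts_after_loss += (next_choice == 'certain')
    win_stay = stays_after_win / wins if wins else float('nan')
    lose_shift = shifts_after_loss / losses if losses else float('nan')
    return win_stay, lose_shift
```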
Results
mOFC → NAc in risk/reward decision-making
Previously, we and others have reported that bilateral inactivation of the mOFC causes a uniform increase in risky choice during probabilistic discounting, whereas lesions/inactivation of the NAc reduces risky choice (Cardinal and Howes, 2005; Stopper and Floresco, 2011; Stopper et al., 2014). As such, it was unclear how serial communication within this circuit may influence probabilistic choice. To address this question, two separate squads of rats were trained on the descending (100 → 6.25%) and ascending (6.25 → 100%) variants of the task. They received counterbalanced infusions of a GABAA/B agonist cocktail (baclofen/muscimol) to inactivate one hemisphere of the mOFC, combined with either the same (ipsilateral disconnection) or contralateral (functional disconnection) hemisphere of the NAc, or asymmetrical saline infusions as a control. Data from 21 rats (n = 11 descending, n = 10 ascending variant) with acceptable placements in both regions were included in the analysis (Fig. 1B). Four rats were excluded because of one or more cannula placements residing outside the defined borders of the mOFC or the NAc. This experiment was run in two separate cohorts, and all effects replicated similarly across both groups.
Disconnection of mOFC-NAc circuitry resulted in a stark flattening of the discounting curve and induced suboptimal patterns of choice across the different probability blocks (Fig. 2A, left). Choice data from all rats were analyzed together with a three-way between/within-subjects ANOVA, with task variant (descending or ascending) as a between-subjects factor. This primary analysis produced a significant main effect of treatment (F(2,38) = 5.59, p = 0.007) but, more importantly, a treatment by block interaction (F(8,152) = 4.21, p = 0.003). Notably, there was no significant main effect of task (F(1,19) = 2.30, p = 0.15) or three-way interaction with the task factor (F(8,152) = 1.34, p = 0.22), indicating mOFC-NAc disconnection had a similar effect on choice regardless of the manner in which reward probabilities changed over the session (100 → 6.25% or 6.25 → 100%; Fig. 2A, right insets). Simple main effects analyses partitioning this treatment by block interaction compared risky choice in each probability block, collapsed across the task factor. These comparisons confirmed the impression made by visual inspection of the data from Figure 2A that mOFC → NAc disconnection differentially altered choice in a manner dependent on the probability block. Specifically, both asymmetrical and ipsilateral disconnections caused a significant reduction in risky choice compared with control treatments during the 100, 50, and 25% blocks, whereas the asymmetrical disconnection also increased risky choice during the 12.5% block (Dunnett's test, p < 0.05), although this effect was driven by animals trained on the descending variant. In this regard, the analysis also yielded a treatment by task interaction (F(2,38) = 5.74, p = 0.007). This reflected the fact that for rats trained on the descending variant, the proportion of risky choices averaged across the entire session did not differ across treatments (F(2,20) = 0.71, p > 0.50), whereas for those tested on the ascending variant, mOFC → NAc disconnections led to fewer risky choices averaged over the session compared with control treatments (F(2,18) = 10.39, p = 0.001).
Subsequent analyses probed how these impairments in probabilistic decision-making were related to alterations in how the outcome of a risky choice influenced the next choice. Under control conditions, animals showed a strong bias toward repeating a rewarded risky choice with another risky choice (win-stay behavior; one-sample t test vs 50%, t(20) = 12.60, p < 0.001), whereas they demonstrated lose-shift behavior on fewer than 50% of trials (one-sample t test vs 50%, t(20) = 2.93, p = 0.008), when collapsed across tasks and blocks. Disconnection of mOFC → NAc circuitry altered responses to wins and losses in opposing manners. Analysis of these data with a two-way ANOVA revealed a significant treatment by outcome interaction (F(2,40) = 9.72, p < 0.001). This reflected the observation that disconnection treatments caused significant reductions and increases in win-stay and lose-shift behavior, respectively, compared with control treatments (Dunnett's test, p < 0.05; Fig. 2B). Indeed, following disconnection treatments, win-stay and lose-shift values did not differ significantly from chance levels (all t values vs 50% < 1.4, all p values > 0.20). Thus, disrupting serial information transfer between the mOFC and NAc resulted in near-random patterns of responses after different risky choice outcomes as they were equally likely to stay or shift following a rewarded or a nonrewarded action.
Despite the marked disruption in optimal choice behavior, mOFC-NAc disconnection had comparatively less robust effects on other performance variables. These treatments caused a slight increase in decision latencies, but analyses of these data did not find this to be a statistically reliable effect (F(2,38) = 2.40, p = 0.10; Fig. 2C). mOFC → NAc disconnection reduced locomotion, but this effect only approached statistical significance with an alpha value of 0.05 [Saline = 1492 ± 125, Ipsilateral (Ipsi) = 1667 ± 236, Disconnection (Disc) = 1216 ± 141; F(2,38) = 3.03, p = 0.06]. These treatments also caused a small but significant increase in omissions (Sal = 0.7 ± 0.3, Ipsi = 2.0 ± 0.5, Disc = 5.5 ± 1.6; F(2,38) = 6.35, p = 0.004). Collectively, these data indicate that the mOFC input to the NAc promotes choice of the more optimal, or preferred, options during risk/reward decision-making guided by outcomes of recent actions. Furthermore, activity in this pathway appears to convey important information about the profitability (or perhaps the state) of the task that rats use to establish an appropriate choice bias, as mOFC-NAc disconnection rendered them less able to incorporate action/outcome information to guide subsequent choices, leading to more random patterns of choice.
mOFC and NAc unilateral controls
Contralateral and ipsilateral disconnection of mOFC → NAc circuitry had comparable effects on decision-making. As such, it was unclear whether these effects indicated that disruption of ipsilateral communication between the mOFC and NAc was sufficient to disrupt normal choice behavior or were merely because of the effects of unilateral inactivation of either region alone. To address this, we conducted additional control experiments after the first set of microinfusion tests, wherein the same rats were retrained on the discounting task and received three more counterbalanced unilateral inactivations of the mOFC, the NAc, and a unilateral saline infusion. Importantly, these unilateral manipulations did not affect choice behavior as there was no main effect of treatment (F(2,40) = 1.05, p = 0.36) and no treatment by block interaction (F(8,160) = 1.61, p = 0.13; Fig. 2D). These important null effects confirm a critical assumption of these disconnection designs, showing that one hemisphere was in fact able to compensate for a unilateral disruption of activity from the other hemisphere. Furthermore, they indicate that the alterations in choice behavior induced by the different disconnections used here are likely because of disruption of mOFC communication with the NAc rather than perturbations induced by any single unilateral manipulation. Together, this combination of results suggests ipsilateral disruption of information transfer between the mOFC and NAc is sufficient to alter performance of this task in a manner comparable to full, bilateral disconnection of this circuitry.
Reward magnitude discrimination
Disruption of mOFC → NAc pathways reduced preference for the large/risky reward, even when the probability of obtaining the larger reward was 100%. To assess whether this was driven by a general disruption in discriminating between rewards of different magnitudes, we conducted a follow-up experiment in a separate group of rats trained on a simpler task, requiring them to choose between two levers that delivered either one or four reward pellets, both with 100% certainty. Ten rats with acceptable placements in the mOFC and NAc (Fig. 1C) were trained for ∼8 d, after which they displayed a strong preference for the larger reward. Previous work by our group has shown that under these conditions, choice behavior remains goal directed rather than habitual, as reinforcer and contingency devaluation are effective at reducing preference for the large reward option (Stopper and Floresco, 2014). After training, rats received counterbalanced contralateral mOFC → NAc disconnections, as well as asymmetrical saline infusions. mOFC → NAc disconnections did not alter choice, as there was no effect of treatment (F(1,18) = 1.43, p = 0.25), and no treatment by block interaction (F(3,54) = 1.16, p = 0.33; Fig. 3A). There were also no effects of treatment on choice latencies, locomotion, or trial omissions (t values < 0.75, p values > 0.48, data not shown). Collectively, these data suggest that the effects of mOFC → NAc disconnections on risk/reward decision-making are unlikely to be attributable to impairments in discriminating between smaller versus larger rewards or nonspecific disruptions in motivational or motor processes.
The fact remains that mOFC → NAc disconnection reduced preference for the larger reward during the 100% blocks of the discounting task but not during the reward magnitude discrimination. As we have argued previously (Stopper and Floresco, 2011, 2014; Montes et al., 2015), this likely reflects differences in perceptions of the relative value of the larger versus smaller reward options that emerge after experience with these two types of tasks. Thus, reward delivery in the magnitude discrimination was deterministic, which led to a more stable representation of the higher incentive value of the larger versus smaller reward option. In comparison, the nature of the probabilistic discounting task incurs volatility in the relative utility of these two options. Therefore, representations of the relative value discrepancy between the larger versus smaller reward are more labile and sensitive to disruption, even in the 100% block. One way to demonstrate this is by comparing response latencies during trials when animals were forced to choose either the large or small reward when reward probabilities were 100%. In so doing, we observed that rats tested on the magnitude discrimination were more than four times slower to make a response when forced to select the smaller versus larger reward under control conditions (main effect of reward magnitude, F(1,9) = 36.02, p < 0.001; Fig. 3B, left). This phenomenon was unaltered by mOFC → NAc disconnections (main effect of treatment and treatment by magnitude interactions, both F values < 1.0, both p values > 0.35). In contrast, in rats performing the discounting task under control conditions, the choice latency difference between the smaller versus larger reward during the forced-choice trials in the 100% blocks was much less robust (task by magnitude interaction, F(1,29) = 22.07, p < 0.001; Fig. 3B, right). From these data, we infer that rats trained on the magnitude discrimination viewed the one-pellet option as substantially inferior to the four-pellet option, whereas for those in the discounting task, this discrepancy was muted. Furthermore, unlike what was observed in the reward magnitude discrimination experiment, mOFC → NAc disconnections abolished the difference in latencies when rats were forced to choose the small versus larger reward during the 100% block of the discounting task (treatment by magnitude interaction, F(1,19) = 6.26, p = 0.02; Fig. 3B, right). This effect was independent of whether reward probabilities decreased or increased over a session (main effect and interactions with task variant factor, all F values < 2.70, all p values > 0.10).
Collectively, these findings highlight two key distinctions in how this circuit mediates different aspects of choice involving different rewards. First, mOFC → NAc communication appears to mediate reward sensitivity and shape probabilistic choice in situations that require conceptualization of the appropriate choice strategy based on the location of a decision maker (or the state) within a given task (Sharpe et al., 2019; Bradfield and Hart, 2020) but does not appear to contribute to more basic forms of reward sensitivity required to discriminate between deterministic rewards of different magnitudes. Furthermore, the near-random patterns of risk/reward decision-making induced by disconnection of this circuit may reflect a disruption in processes involved in appraising the relative incentive value of the larger versus smaller rewards when these rewards are uncertain.
mOFC → DMS in risk/reward decision-making
We have recently characterized a role for the DMS in facilitating adjustments in decision biases during probabilistic discounting (Schumacher et al., 2021). Here, we sought to determine whether the ability of the DMS to facilitate shifts in decision biases was dependent on input from the mOFC. The overall analysis included data from 19 rats (10 descending, 9 ascending task) with acceptable placements in both regions (Fig. 1D). Data from another three rats were removed because of missed placements.
Disconnection of mOFC inputs to the DMS altered choice in a manner distinct from that observed following mOFC → NAc disconnection and was dependent on how reward probabilities varied across the session. Specifically, these manipulations markedly impaired adjustments in choice biases in response to changes in the likelihood of obtaining the larger/risky reward. Choice data from all rats were analyzed together with a three-way between/within-subjects ANOVA, with task variant (descending or ascending) as a between-subjects factor. This analysis produced a significant treatment by task interaction (F(2,34) = 8.39, p = 0.001). Follow-up analyses partitioned this interaction by task variant (descending vs ascending). Rats trained on the descending variant displayed a strong bias for the large/risky option but gradually shifted choice away from this option as reward probabilities decreased across blocks under control conditions. However, mOFC → DMS disconnections led to a greater proportion of risky choices across the session (Dunnett's test, p < 0.05 both ipsilateral and asymmetrical; Fig. 4A). In stark contrast, rats trained on the ascending task initially displayed a strong bias away from the large/risky option at the start of the session when reward probabilities were low but shifted toward the large/risky option as its profitability increased over blocks. These shifts in choice were disrupted by mOFC → DMS disconnections as rats showed reduced risky choice in subsequent blocks across the session relative to control conditions (Dunnett's test, p < 0.05 both ipsilateral and asymmetrical; Fig. 4C). In both instances, disconnection of mOFC → DMS circuitry hindered adjustments in choice biases from the preference they displayed at the start of the session.
Analysis of win-stay and lose-shift behavior exposed additional differences in how mOFC → DMS disconnections altered choice when reward probabilities decreased or increased over a session. The overall analysis of these data revealed a significant three-way treatment by outcome type (win-stay/lose-shift) by task interaction (F(2,34) = 5.85, p = 0.007), suggesting that activity within mOFC → DMS circuitry differentially regulates how rewarded versus nonrewarded actions influence subsequent choice, depending on the manner in which reward probabilities fluctuated. This interaction was initially partitioned by outcome type and revealed that mOFC → DMS disconnection did not significantly alter win-stay behavior on either task variant (all F values < 2.28, all p values > 0.12). Rather, disruption of this corticostriatal circuit differentially altered action selection after nonrewarded risky choices (treatment by task interaction: F(2,34) = 7.86, p = 0.002). For animals trained on the descending variant, mOFC → DMS disconnections reduced lose-shift behavior following asymmetrical (but not ipsilateral) disconnection (F(2,34) = 3.37, p = 0.046; Dunnett's test, p < 0.05; Fig. 4B). Thus, the increase in risky choice observed in this group was driven primarily by a reduction in the impact that nonrewarded risky choices had on subsequent action selection.
Conversely, analysis of the lose-shift data from rats trained on the ascending variant also revealed a significant effect of treatment (F(2,34) = 4.94, p = 0.013). However, in this instance, mOFC → DMS disconnection increased the tendency to shift to the small/certain option after a nonrewarded risky choice, following asymmetrical, but not ipsilateral, disconnection (Dunnett's test, p < 0.05; Fig. 4D). Thus, activity within mOFC → DMS circuitry appears to survey information regarding changes in the frequency of nonrewarded actions. When the odds of receiving larger rewards are initially high and then decrease, activity in this pathway signals the increasing frequency of losses to the DMS to favor shifts in biases as profitability wanes. Conversely, when larger rewards are initially rare, the mOFC may convey information to the DMS about the reduction in losses as profitability increases to shift choice biases toward riskier, yet potentially more profitable actions.
Closer inspection of the discounting curves in Figure 4, A and C, revealed that mOFC → DMS disconnection disrupted shifts in choice biases more prominently during the initial portion of the session compared with the latter portions. Under control conditions, rats trained on either variant showed a comparatively steeper change in risky choice from the first to the third block of the session compared with the more muted change following contralateral disconnections. In contrast, the relative rate of change in risky choice from the third to the fifth block was comparable across control versus disconnection treatments, although the overall proportion of risky choices differed across treatments. To formally quantify this, post hoc analyses compared how contralateral disconnection treatments altered slopes of the discounting curves during the initial (first to third) and latter (third to fifth) blocks of the session, relative to control treatments. In this analysis, we computed these two slopes for each rat and used their absolute values to accommodate differences in the sign of the slopes (negative/positive) in the descending versus ascending task variants. We found that during the initial blocks of the session, the average slope of the discounting curve under control conditions was 0.214 ± 0.02. This was significantly reduced by mOFC → DMS disconnections (0.119 ± 0.02; F(1,17) = 7.64, p = 0.013). This confirmed statistically that the rate at which rats shifted their choice biases during the first few blocks of a session was slowed by disconnection treatments. In contrast, during the latter part of the session, the slopes of the discounting curves did not differ across treatments (control = 0.147 ± 0.03; disconnection = 0.200 ± 0.03; F(1,17) = 1.22, p = 0.28). These effects were comparable across rats trained on the descending and ascending task variants (effects of task, all F values < 1.0). Thus, the more rigid patterns of decision-making induced by disruption of mOFC → DMS circuitry were most prominent during the initial portions of the session.
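As a worked sketch of this slope comparison, the two absolute slopes for each rat could be computed as follows; whether the original analysis used a least-squares fit over blocks or a simple two-point difference is not specified, so the fit here is an assumption, and the example values are made up.

```python
import numpy as np

def discounting_slopes(pct_risky_by_block):
    """Absolute slopes of a discounting curve over the early and late blocks.

    pct_risky_by_block: sequence of five values (proportion of risky choices
    in blocks 1-5) for one rat under one treatment. The slope is taken from a
    least-squares line fit over blocks 1-3 and blocks 3-5, and absolute values
    are used so descending and ascending variants can be analyzed together.
    """
    p = np.asarray(pct_risky_by_block, dtype=float)
    early = abs(np.polyfit([1, 2, 3], p[0:3], 1)[0])   # blocks 1-3
    late = abs(np.polyfit([3, 4, 5], p[2:5], 1)[0])    # blocks 3-5
    return early, late

# Example with made-up values for one rat on the descending variant:
early, late = discounting_slopes([0.95, 0.80, 0.55, 0.40, 0.30])
```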
With respect to other performance measures, there were no significant differences across treatments in locomotion (Sal = 1271 ± 233, Ipsi = 1597 ± 197, Disc = 1522 ± 349) or trial omissions (Sal = 2.8 ± 2.2, Ipsi = 3.7 ± 2.2, Disc = 5.1 ± 2.2; all F values < 1.99, all p values > 0.15). However, mOFC-DMS disconnection did increase choice latencies following both the ipsilateral and disconnection treatments (F(2,34) = 6.64, p = 0.004, Dunnett's test, p < 0.05; Fig. 4E) in a manner independent of the task variant (treatment by task, F(2,34) = 0.61, p = 0.55).
Collectively, these findings suggest that mOFC signals interfacing with the DMS facilitate modification of decision biases consequent to changes in the likelihood of receiving larger rewards. Disrupting the mOFC → DMS circuit on either task variant caused rats to persist with their initially more preferred option as profitability of risky options changed, accompanied by changes in negative feedback sensitivity and increased choice deliberation times.
mOFC and DMS unilateral controls
As was done in the mOFC → NAc disconnection experiment, after the first set of microinfusion test days, rats were retrained on the discounting task and then received three more unilateral infusions to ascertain whether unilateral inactivation of the mOFC or DMS had any disruptive effect on behavior. Notably, these unilateral manipulations did not affect choice behavior as there was no main effect of treatment (F(2,38) = 0.69, p = 0.51) and no treatment by block interaction (F(8,152) = 1.29, p = 0.25; Fig. 4F). These important null effects confirm that one hemisphere was in fact able to compensate for a unilateral disruption of activity from the other hemisphere. Furthermore, they indicate that the alterations in choice behavior induced by the different disconnections are more likely the result of disrupting mOFC communication with the DMS rather than perturbations induced by any single unilateral manipulation. In this regard, previous studies using this disconnection approach have also observed perturbations in behavior after contralateral and ipsilateral disconnection of the striatum from its cortical, amygdalar, or hippocampal afferents (Bossert et al., 2012; St. Onge et al., 2012a; Warren et al., 2016; Jenni et al., 2017; Piantadosi et al., 2020), whereas only contralateral but not ipsilateral disconnections of PFC → amygdala circuitry were sufficient to alter decision-making (St. Onge et al., 2012a). We interpret this set of findings to indicate that intact communication between the mOFC and either the NAc or DMS on both sides of the brain is essential for optimal risk/reward decision-making, which may be in part because of the relatively dense contralateral connectivity between these brain regions (Hoover and Vertes, 2011).
Discussion
Here, we report that separate corticostriatal circuits linking the mOFC to the ventral and dorsal striatum regulate dissociable component processes of risk/reward decision-making. The mOFC → NAc circuitry aids in orienting choice toward a more preferred, higher-valued option, whereas mOFC → DMS pathways facilitate tracking of changes in reinforcement contingencies to support flexible reward seeking.
mOFC → NAc circuitry
Perturbing mOFC → NAc circuits induced suboptimal choice patterns that manifested as a flattening of the discounting curve, reducing preference for the large reward during more profitable blocks and increasing risky choice when this option was less advantageous. This was associated with broad perturbations in sensitivity to recent outcomes as rats were equally likely to win-stay or lose-shift following a rewarded or a nonrewarded action, reflecting an impairment in using action-outcome information to guide future choice. Notably, this near-random pattern of responding complements findings implicating mOFC in promoting stimulus stickiness, which refers to the degree of persistence in choosing one option independent of prior outcomes. mOFC lesions/inactivations reduce stickiness and increase the tendency to shift choice on probabilistic reward tasks, regardless of what happened previously (Noonan et al., 2010; Verharen et al., 2020). Likewise, blockade of mOFC-D1 receptors also increases switching on both probabilistic discounting and reversal tasks (Jenni et al., 2021). The present data expand on this to suggest that mOFC → NAc circuitry aids in maintaining choice biases directed toward more profitable actions.
The disruption in choice bias by mOFC → NAc disconnection was apparent even in the 100% block of the probabilistic task, although similar disconnections were without effect on a reward magnitude discrimination, where all rewards were certain. This combination of findings suggests mOFC → NAc circuitry does not influence action selection guided by basic discriminative processes when actions produce reward in a deterministic (or fully observable) manner. Instead, this circuit aids in appraising the relative incentive value of different rewards when they are tainted with uncertainty. With this in mind, contemporary theory of mOFC function posits that it influences choice when action–outcome contingencies are not immediately observable, estimating likely outcomes to influence choice based on one's location (or state) within the action sequences of a given task (Sharpe et al., 2019; Bradfield and Hart, 2020). This idea is drawn from observations that mOFC lesions impair adjustments in behavior following outcome devaluation, or state-dependent value retrieval (Bradfield et al., 2015; Malvaez et al., 2019). In a similar vein, probabilistic (vs deterministic) schedules of reward also create unobservable situations as decision biases are shaped more prominently by broader reward history rather than any immediate outcome. The present data could imply that mOFC cells projecting to this striatal region aid in representing task states, and that disruption of this circuit led to more indiscriminate responding throughout the session.
It is interesting to note that mOFC → NAc disconnections altered choice in a manner distinct from the disinhibition of risky choice induced by bilateral inactivation of the mOFC (Stopper et al., 2014), the reduction in risky choice caused by NAc inactivation or disruption of dopamine D1 receptor modulation of PFC inputs to the NAc (Stopper and Floresco, 2011; Jenni et al., 2017), as well as the null effects of medial (prelimbic) PFC → NAc disconnection (St. Onge et al., 2012a). This highlights how targeting a subpopulation of projection neurons can reveal effects different from those produced by bilateral lesions/inactivation of a particular brain region. On the other hand, mOFC → NAc disconnection caused effects that more closely resembled those induced by inactivation of the lateral habenula (Stopper and Floresco, 2014), which also induced random choice patterns. In this regard, Stolyarova and Izquierdo (2017) investigated the mOFC contribution to choice between delayed rewards that were of equal magnitude and average cost but differed in terms of delay variability (termed "expected uncertainty"). When the delay for one of the rewards changed, control animals shifted bias toward the option with lower average delay, whereas those with mOFC lesions were insensitive to expected outcome uncertainty as their behavior was less stable/more exploratory. Analysis of reward approach behavior further revealed that mOFC lesions did not disrupt inferences of average outcome values but did impair representations of expected uncertainty. These observations dovetail with the present findings, where mOFC → NAc disconnection also rendered animals insensitive to relative reward uncertainty, inducing more stochastic choice patterns less attuned to reward probabilities. Together, these findings corroborate the idea that mOFC neurons interfacing with the NAc may stabilize value representations and the choice patterns they guide by using reward history to bias action selection based on expected outcome uncertainty. We should note that our analyses only examined how mOFC → NAc circuitry processed outcomes of the most recent choices and how these influenced subsequent ones. We therefore cannot make firm claims about whether this circuit biases decision-making by integrating reward history farther in the past beyond the most recent choices, although this possibility may be addressed by future studies.
mOFC → DMS circuitry
In contrast to mOFC → NAc manipulations, perturbing mOFC inputs to the DMS impaired adjustments in choice guided by action/outcome histories. Disconnection of this circuit slowed shifts in bias away or toward the risky option when probabilities of obtaining larger rewards decreased or increased over time. These effects were driven by changes in negative feedback sensitivity and were remarkably similar to the effects of bilateral DMS inactivation on this task (Schumacher et al., 2021). These findings indicate that mOFC → DMS communication facilitates adjustments in choice in accordance with changes in profitability, primarily by tracking the frequency of nonrewarded actions. This is in keeping with a well-characterized role for the DMS in mediating negative feedback sensitivity across different conditions. For example, DMS cell body or dopamine lesions increase both repetition of incorrect choices (Adams et al., 2001; Skelin et al., 2014) and negative feedback sensitivity during probabilistic reversal learning (Grospe et al., 2018). Thus, the DMS appears to play a key role in redirecting responses following reward omission, a function that is critically dependent on input from the mOFC.
The mOFC makes a nuanced contribution to different forms of cognitive flexibility that is dependent in part on task parameters. Lesions/inactivation of this region have been reported to either impair (Gourley et al., 2010) or facilitate (Hervig et al., 2020) reversal learning when outcomes are deterministic. On the other hand, this region seems to be preferentially recruited in aiding flexible decision-making in situations involving reward uncertainty (Noonan et al., 2012; Dalton et al., 2016; Verharen et al., 2020; Jenni et al., 2021). Furthermore, mOFC neurons develop responses to changes in reward value, which have been posited to represent the current value of stimuli or actions and, in turn, can support adjustments in behavior when these values deviate (Burton et al., 2014; Lopatina et al., 2016).
In a similar vein, the DMS is critical for selecting or adjusting actions based on their currently expected reward or on changes in their value (Corbit et al., 2001; Yin et al., 2005, 2008; Balleine et al., 2007). Accordingly, DMS dysfunction interferes with various forms of cognitive flexibility involving changes in outcome value, including reversal learning, strategy shifting, and task switching (Pisa and Cyr, 1990; Ragozzino et al., 2002, 2009; Castañé et al., 2010; Grospe et al., 2018) and flexibility during changes in probabilistic schedules of reward delivery (Torres et al., 2016; Schumacher et al., 2021). The DMS may support these functions by representing the current state of the task (i.e., a particular point within a behavioral sequence), which includes information to determine expected outcomes of different actions (Stalnaker et al., 2016). Notably, OFC lesions abolish state coding in DMS neurons (Stalnaker et al., 2016). The relative rigidity of choice caused by disrupting mOFC input to the DMS reported here may therefore reflect an impairment in progressing through different task states (in this instance, different probability blocks), leading rats to linger in earlier states. Thus, whereas mOFC inputs to the NAc aid in establishing and stabilizing representations of task states, parallel signals to the DMS may facilitate transitions across states. Interestingly, impairments in flexible decision-making induced by disruption of mOFC → DMS circuitry were more prominent in the initial part of the session, compared with the latter part when animals obtained more choice-outcome reward history information. This suggests that this circuit may be more involved in facilitating state transitions based on a memory of task structure rather than local reward history.
It is of interest to place these findings in a broader context, comparing how the DMS interacts with other prefrontal subregions to facilitate behavioral modification in response to changes in action value. For example, numerous studies have implicated prelimbic PFC inputs to the DMS in supporting goal-directed behavior, strategy set-shifting, and selective attention (Christakou et al., 2001; Baker and Ragozzino, 2014; Hart et al., 2018). Interestingly, prelimbic PFC inactivation altered probabilistic discounting in a manner qualitatively similar to both bilateral DMS inactivation and mOFC → DMS disconnection (St. Onge and Floresco, 2010). This does not necessarily imply that mOFC versus PFC pathways to the DMS are simply redundant circuits that aid flexibility. Rather, these prefrontal regions likely contribute distinct types of information but, when probed with certain complex tasks, may show similar effects on behavior. A recent theoretical synthesis by Sharpe et al. (2019) compared prelimbic PFC versus OFC functions and how they may interface with the DMS to guide behavior. The prelimbic cortex may process higher order features of a task; for example, it is necessary for shifting strategies or using contextual/temporal cues (observable information) to guide appropriate behavior. Conversely, the mOFC is necessary for retrieving unobservable outcome information to guide adaptive choice (Bradfield et al., 2015). In their model, Sharpe et al. (2019) propose that the prelimbic cortex influences the DMS to represent the state space relevant to the current environment, whereas orbital regions track an actor's position within that state space. Thus, although it is apparent that DMS circuits incorporate signals from both the mOFC and prelimbic PFC, clarifying the distinct functional contribution of these different corticostriatal pathways remains an important question for future studies. This being said, an important consideration is that the effects of the pharmacological disconnections used here may reflect not only disruption of direct pathways between the mOFC and striatal targets but also disruption of routes through intermediary regions that share projections with these circuits, such as the basolateral amygdala or the parafascicular thalamus (Stayte et al., 2021).
Conclusions and Implications
Nuclei within the striatum work in a coordinated manner to facilitate execution of effective strategies during probabilistic decision-making, serving to either reinforce or adjust action biases. The DMS supports execution of different response patterns when violations in reward expectations show conditions have changed, whereas the NAc directs responses in line with the subjective evaluations of value when choosing between rewards of varying magnitude and risk. The present findings reveal that both striatal regions depend on information processed by the mOFC pertaining to the profitability of different options to regulate their influence over action selection, but these parallel signals influence choice in different ways. mOFC → NAc pathways help anchor and stabilize task states, allowing for context-appropriate patterns of choice. This is complemented by mOFC → DMS circuits, which facilitate smooth transitions in choice biases across different task states in relation to fluctuations in action–outcome contingencies in male rats. Confirming that these functions generalize to females remains an important topic for future research.
Although the specific functions of the mOFC remain elusive, our findings underscore that a dissection of the specific cognitive/motivational processes subserved by the functional circuits it forms may lead to a more comprehensive understanding of its contribution to various behaviors. In turn, this may provide important insight into how pathologic changes to these circuits may contribute to cognitive dysfunction in disorders thought to be driven by perturbed corticostriatal communication. Conditions such as obsessive-compulsive disorder and substance use disorders are characterized by maladaptive patterns of response persistence and behavioral rigidity (Chamberlain et al., 2006; Everitt and Robbins, 2016; Kanen et al., 2019), which have been associated with abnormal patterns of brain activation in mOFC circuits (Kalivas and Volkow, 2005; Goldstein and Volkow, 2011; Robbins et al., 2019). Understanding how these circuits regulate normal behaviors may aid in developing treatment strategies to ameliorate abnormal ones.
Footnotes
This work was supported by a grant from the Canadian Institutes of Health Research (PJT-162444) to S.B.F. and a Natural Sciences and Engineering Research Council Fellowship to N.L.J.
The authors declare no competing financial interests.
Correspondence should be addressed to Stan B. Floresco at Floresco@psych.ubc.ca