Abstract
The prefrontal cortex is critical for decision-making across species, with its activity linked to choosing between options. Drift diffusion models (DDMs) are commonly employed to understand the neural computations underlying this behavior. Studies exploring the specific roles of regions of the rodent prefrontal cortex in controlling the decision process are limited. This study explored the role of the prelimbic cortex (PLC) in decision-making using a two-alternative forced-choice task. Rats first learned to report the location of a lateralized visual stimulus. The brightness of the stimulus indicated its reward value. Then, the rats learned to make choices between pairs of stimuli. Sex differences in learning were observed, with females responding faster and more selectively to high-value stimuli than males. DDM analysis found that males had decreased decision thresholds during initial learning, whereas females maintained a consistently higher drift rate. Pharmacological manipulations revealed that PLC inactivation reduced the decision threshold for all rats, indicating that less information was needed to make a choice in the absence of normal PLC processing. μ-Opioid receptor stimulation of the PLC had the opposite effect, raising the decision threshold and reducing bias in the decision process toward high-value stimuli. These effects were observed without any impact on the rats’ choice preferences. Our findings suggest that PLC has an inhibitory role in the decision process and regulates the amount of evidence that is required to make a choice. That is, PLC activity controls “when,” but not “how,” to act.
Significance Statement
This study reports causal evidence for a part of the rat prefrontal cortex, the prelimbic cortex, in controlling the amount of information needed to make a choice. Results were based on reversible inactivation using the GABAA agonist muscimol and by stimulation of μ-opioid receptors using intracortical infusions of the selective μ-agonist DAMGO. We also found evidence for a sex difference in learning and performing a visually guided two-alternative forced-choice task. Drift diffusion models found that females had stable decision processes throughout learning and showed a persistent bias against the lower-value option. In contrast, males exhibited changes in their decision processes, notably reducing the amount of information needed to make a choice over the period of early choice learning.
Introduction
The prefrontal cortex is crucial for decision-making across species, with activity in this region linked to choosing between options (Hanks and Summerfield, 2017). Drift diffusion models (DDMs) have been used to provide insights into the neural computations that underlie decisions (Ratcliff et al., 2016). Human fMRI studies have reported that the medial prefrontal cortex tracks the decision threshold (Domenech and Dreher, 2010), a DDM parameter that accounts for the amount of evidence needed to commit to a choice, and the starting point bias (Mulder et al., 2012), another DDM parameter that accounts for speeded performance for choices that have the largest payoff. Studies in animals that establish roles for specific prefrontal regions in modulating the parameters of DDMs are generally lacking.
The prelimbic cortex (PLC) is a core part of the rodent prefrontal cortex (Laubach et al., 2018) involved in various aspects of learning and decision-making, including instrumental learning (Corbit and Balleine, 2003; Killcross and Coutureau, 2003), categorical learning (Reinert et al., 2021), reversal learning (Nakayama et al., 2018; B. A. Bari et al., 2019; Jeong et al., 2020; Choi et al., 2023), and inhibitory control (Chudasama and Muir, 2001; Risterucci et al., 2003; Narayanan et al., 2006; A. Bari et al., 2011). To our knowledge, no published study has reported on the role of the PLC in evidence accumulation or in relation to the parameters of DDMs and other computational models of the decision process.
Other parts of the rodent cerebral cortex have been studied in the context of decision-making. Hanks et al. (2015) and Erlich et al. (2015) reported differences in the parietal and premotor cortices of rats performing an evidence accumulation task. Reversible inactivations of a premotor cortical region called the frontal orienting field (FOF), but not the parietal cortex, impaired choices that depended on evidence accumulation, but not those that simply required detection of visual stimuli. Their computational modeling suggested that the effects of FOF inactivation were on the time constant for evidence accumulation and that the FOF's role is in translating decisions into actions.
More recently, Vázquez et al. (2024) found that optogenetic inactivation of the perigenual ACC led to task disengagement, slower responding, and reduced drift rates in DDMs. Their task used a complex design, in which rats learned associations between categorical odors and delays to rewards of different sizes. Notably, Vázquez et al. (2024) did not find effects of their optogenetic perturbations on the decision threshold, unlike the fMRI study by Domenech and Dreher (2010). Their study focused on a region of the rat PFC that is caudal to the PLC. These cortical regions have distinct anatomical connections (Gabbott et al., 2005), so it is possible that the PLC and perigenual ACC have different roles in the decision process.
To examine this possibility, the present study used a visually guided two-alternative forced-choice task (Swanson et al., 2021; White et al., 2024). The task used dynamic visual stimuli, created using open-source LED matrices (Swanson et al., 2021). The brightness of the stimuli, effectively the number of LEDs that were active at any moment, was predictive of the concentration of liquid sucrose received at an adjacent reward port. Reversible inactivation of the PLC, and in a few rats the underlying ventral orbital cortex, sped up choices without affecting choice preference, while stimulation of μ-opioid receptors had the opposite effect. DDM analysis revealed that PLC inactivation reduced the decision threshold, suggesting rats needed less evidence to make a choice. In contrast, μ-opioid stimulation slowed performance and increased the threshold.
We observed sex differences as rats learned the task. Males became faster at responding during choice learning, while females responded faster overall and more selectively to the brighter stimulus associated with a larger reward. DDM analysis revealed a stable decision process in females, established during early cue learning. In contrast, males demonstrated changes in their decision process, adjusting their decision threshold over the period of choice learning.
Materials and Methods
Procedures were approved by the Animal Care and Use Committee at American University (Washington, DC) and conformed to the standards of the National Institutes of Health Guide for the Care and Use of Laboratory Animals. Data from these studies are available on GitHub: https://github.com/LaubachLab/PFC-DDM.
Animals
Nine male Long Evans rats (350–450 g, five from Charles River Laboratories, four from Envigo) and nine female Long Evans rats (200–250 g, Envigo) were used in this study. Animals were individually housed on a 12 h light/dark cycle. During training and testing, animals had regulated access to food (12–16 g) to maintain body weights at ∼90% of their free-access weights. Two males were not motivated by liquid sucrose rewards, so they were switched to water regulation early in training. They were maintained at ∼90% of their free-access weights. These animals typically consumed 10–15 ml of water during behavioral sessions and were given an additional 10–15 ml at ∼4 P.M. daily, with free access to food. They had 1 d per week of free access to water.
Behavioral apparatus
Animals were trained in sound-attenuating behavioral boxes (Med Associates) that had a single, horizontally placed spout mounted to a lickometer 6.5 cm from the floor with a single white LED placed 4 cm above the spout. Solution lines were connected to 60 cc syringes, and solution was made available to animals by lick-triggered, single-speed pumps (PHM-100; Med Associates), which drove the syringe plungers. Each lick activated a pump, which delivered roughly 30 μl per 0.5 s activation. Reward availability was signaled by the illumination of a white LED located above the reward port and 0.2 s activation of a 4.5 kHz Sonalert tone (Mallory SC628HPR). On the wall opposite the spout, three 3D-printed nosepoke ports were aligned 5 cm from the floor and 4 cm apart and contained 3 mm Adafruit IR Break Beam sensors. A 1.2″ 8 × 8 pure green LED matrix (Adafruit) was placed 2.5 cm above the center of each of the three nosepoke ports outside of the box for visual stimulus presentation. The LED matrices were controlled using the Adafruit_GFX and Adafruit_LEDBackpack libraries, and the microcontroller software is provided in a previous publication (Swanson et al., 2021).
Training procedure: cue learning
In this stage of training, the rats learned to report the location of a visual stimulus associated with a higher or lower concentration of liquid sucrose. The luminance of the visual stimuli was matched to the concentration of liquid sucrose received at the reward port. The rats first learned to detect a high-luminance stimulus and then a second low-luminance stimulus. Because the stimuli differed by luminance and were associated with different reward values, the rats would have relied on perceptual processing to recognize the difference in stimulus luminance during this stage of training. We therefore refer to this stage of training as “cue learning.”
Nine males and nine females were trained in a behavioral task described previously (Swanson et al., 2021; White et al., 2024; Fig. 1A–C). First, animals licked at a reward spout in an operant chamber to receive 16% wt/vol liquid sucrose (16 of the rats) or 60 μl of water (2 of the rats) with the LED above the spout turned on. In subsequent sessions, animals were trained using the method of successive approximations to respond in nosepoke ports in response to distinct visual stimuli. A 4 × 4 square of illuminated LEDs displayed over the center port signaled animals to initiate trials. Trial initiation was followed by lateralized presentation of one of two cues, which will be referred to as “single-offer” trials (Fig. 1B). We refer to the cues as high-value (eight illuminated LEDs) and low-value (two illuminated LEDs) throughout this paper.
Figure 1-1
Behavioral events and timeline of experiments. The timing of behavioral events in the two-alternative forced-choice task is shown in panel A. The timeline of behavioral training (procedural, cue, and choice learning) is shown in panel B.
The stimuli were presented randomly by side (left or right) as single offers (one stimulus per trial). The location of illuminated LEDs in the active 8 × 8 matrix changed every millisecond during the period of illumination, which began when rats entered the central port and ended when rats entered one of the lateralized ports. Responses to the high-value stimulus yielded access to a 16% wt/vol sucrose reward or 60 μl of water (two of the male rats) at the reward spout, and responses to the low-value stimulus yielded a 4% wt/vol sucrose reward or 15 μl of water (two male rats). Responses to nonilluminated ports were considered errors and were unrewarded. Responses that took longer than 5 s following trial initiation were counted as errors of omission and were unrewarded. On valid trials, animals had to collect their reward within 5 s following responses to receive fluid. Animals were trained for five 60 min single-offer sessions of at least 100 trials before moving on to dual-offer acquisition and test sessions.
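The outcome rules in the preceding paragraph can be summarized as a small classifier. This is a minimal sketch, not the authors' Med-PC or Python code: the `Trial` fields and the outcome labels ("omission", "error", "uncollected", "rewarded") are hypothetical names introduced here for illustration, and only the 5 s limits come from the text.

```python
from dataclasses import dataclass
from typing import Optional

MAX_LATENCY_S = 5.0    # responses slower than this count as omissions
MAX_RETRIEVAL_S = 5.0  # fluid must be collected within this window


@dataclass
class Trial:
    latency_s: Optional[float]    # trial initiation -> side-port entry (None: no response)
    port_illuminated: bool        # True if the chosen port displayed a stimulus
    retrieval_s: Optional[float]  # side-port entry -> reward collection (None: never collected)


def classify(trial: Trial) -> str:
    """Return a hypothetical outcome label for one trial."""
    if trial.latency_s is None or trial.latency_s > MAX_LATENCY_S:
        return "omission"
    if not trial.port_illuminated:
        return "error"
    if trial.retrieval_s is None or trial.retrieval_s > MAX_RETRIEVAL_S:
        return "uncollected"
    return "rewarded"


print(classify(Trial(latency_s=0.8, port_illuminated=True, retrieval_s=1.2)))
```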
We refer to the stimuli as high-value and low-value throughout this paper. As the task design positively mapped luminance to sucrose concentration (eight LEDs → 16% sucrose, two LEDs → 4% sucrose), the stimuli could also have been characterized as high- and low-luminance stimuli (as in Swanson et al., 2021). This aspect of the design could have resulted in the salience of the stimuli contributing to the rats’ detection of the cues on single-offer trials and choices between the cues on dual-offer trials. An additional experiment, using a reversed mapping of luminance and sucrose concentration, was conducted to address this issue and is described below.
We used differential luminance in this experiment, as well as in the studies reported by Swanson et al. (2021) and White et al. (2024). The original goal of these experiments was to test a hypothesis from studies by Kacelnik et al. (2011), which suggested that animals learn to make choices between stimuli based on single encounters with the stimuli. A key factor in testing their hypothesis using a standard operant two-alternative forced-choice design was that the rats would show latency differences when reporting the higher- and lower-value stimuli. We initially tested how rats learned to detect stimuli that differed by luminance (the stimuli used in the present study), direction of drift, and isoluminant static patterns (based on Lashley, 1930). We found that rats came to show different response latencies within a single test session of 100 or more trials when tested with stimuli that differed by luminance, but not the other stimuli. We therefore used stimuli that differed by luminance in the present study.
Training procedure: choice learning
In this stage of training, the rats learned to make choices between the visual stimuli. As in the cue learning stage of training, animals initiated trials by nose poking at the center port. In 60 min sessions, two-thirds (∼67%) of trials were single-offer trials, as described above. In the remaining third (∼33% of trials), animals were presented with both the high- and low-value cues simultaneously, randomized by side, and the animals had the choice of responding to either cue. These trials are referred to as “dual-offer” trials (Fig. 1B). Single- and dual-offer trials were interleaved throughout the 60 min sessions. Animals experienced choice learning over five 60 min sessions before undergoing surgery. A difference in choice learning between the present study and White et al. (2024) is that the animals only experienced choice sessions during this stage of training. In the study by White et al. (2024), there were sessions with only single-offer trials interleaved between sessions with choice learning. All test sessions following surgery included mixtures of single-offer and dual-offer trials, as described here for the choice learning phase of the task.
The behavioral measures of interest are referred to as “latency,” “reward retrieval,” “time at spout,” “intertrial interval,” “high-value preference,” and “side bias” (Extended Data Fig. 1-1). Latency is measured as the time taken for rats to go from trial initiation to responding in an illuminated choice port after the onset of the visual stimuli. Reward retrieval is measured as the time from a nose poke response to the time that reward is collected. Only valid trials with latencies and reward retrievals that were <5 s were included in analyses. Time at spout is measured as the time between the first lick when a reward is collected and the last lick before a new trial is initiated. Intertrial interval (ITI) is measured as the time from one trial initiation until the next trial initiation. Only trials where the ITI was <60 s were included in analyses. High-value preference is measured as the proportion of dual-offer trials on which the animals responded to the high-value cue. Side bias is measured as the percentage difference from 50% that a rat chooses a given side on dual-offer trials. For example, if a rat chooses left on 70% and right on 30% of dual-offer trials, then this rat would have a 20% side bias.
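The two choice measures defined above reduce to simple arithmetic on dual-offer trials. The sketch below is illustrative only: the function names and the boolean-list input format are assumptions made here, not the authors' analysis code.

```python
def high_value_preference(chose_high):
    """Proportion of dual-offer trials on which the high-value cue was chosen."""
    return sum(chose_high) / len(chose_high)


def side_bias(chose_left):
    """Absolute difference (percentage points) between the left-choice rate and 50%."""
    pct_left = 100.0 * sum(chose_left) / len(chose_left)
    return abs(pct_left - 50.0)


# Worked example from the text: left on 70% and right on 30% of dual-offer trials.
left_choices = [True] * 70 + [False] * 30
print(side_bias(left_choices))  # 20.0
```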
Reversed luminance–reward mapping
To determine if stimulus salience had a role in how rats learned the values of the task stimuli and how to make choices between them, a separate cohort of six male rats was trained using a reverse mapping of luminance and reward value. For these animals, two LEDs were paired with 16% sucrose, and eight LEDs were paired with 4% sucrose. All other details of the animals’ training were the same as reported above.
Surgery
Animals were given 2–3 d with free access to food and water prior to cannula implantation surgery. Anesthesia was induced by intraperitoneal injection of ketamine (100 mg/kg for males, 75 mg/kg for females) and maintained with isoflurane (0.5–2.0%; flow rate, 5.0 cc/min). Carprofen (0.1 ml, 50 mg/ml) was injected intraperitoneally immediately after the ketamine injection. Animals were placed into a stereotaxic frame using nonpenetrating ear bars. The scalp was then shaved and covered with iodine, and the eyes were covered with ophthalmic ointment. Lidocaine (2%, 0.5 ml) was injected under the scalp, and a longitudinal incision was made along the skull, followed by lateral retraction of the skin. Two skull screws were placed on the caudal edges of the skull to support the adhesion of the implant. Bilateral craniotomies were made above implant locations targeting the prelimbic cortex. Twenty-six gauge stainless steel guide cannulae (Plastics One) were lowered into the rostral PFC (coordinates from bregma, AP +3.0 mm, ML ±1.2 mm, and DV −2.2 mm from the surface of the brain at a 30° lateral and 12° posterior angle; Paxinos and Watson, 2014). Coordinates were adjusted to avoid blood vessels, leading to most placements being anterior to the starting coordinates. Each guide cannula contained 33 ga stainless steel wire that projected 0.4 mm past the tip of the guide cannula. Craniotomies were closed using cyanoacrylate (Slo-Zap) and an accelerator (Zip Kicker), and methyl methacrylate dental cement (A-M Systems) was applied around the implants and affixed to the skull via the skull screws. Animals were given 0.5 ml of carprofen (50 mg/ml) in 500 ml of water for postoperative analgesia and recovered in their home cages for at least 1 week, with free access to food and water and daily monitoring, until weights returned to presurgical levels. Animals were then returned to regulated food or water access.
Drug infusions
The timeline of drug infusions is summarized in Extended Data Figure 1-1B. Following recovery from surgery and regulation of food or water access, animals were first reacclimated to the behavioral task until they performed as prior to surgery. Then, animals were exposed to the same duration and levels of isoflurane gas used during drug infusions to control for exposure to isoflurane. Drugs were infused under isoflurane gas to limit the effects of experimenter handling, as in previous studies from our lab (Narayanan et al., 2006; White and Laubach, 2022) and by other groups (Erlich et al., 2015). A second control was carried out the following day where animals received an infusion of 1 μl of PBS and then increasing concentrations of muscimol over 3 consecutive days (0.01, 0.1, and 1.0 μg/μl). A consequence of this design is that while neuronal effects of subthreshold concentrations of muscimol would be gone by the following daily test session (i.e., muscimol has neuronal effects for 2–3 h: Martin and Ghez, 1999), longer-term effects of the drug could occur and influence the overall pattern of results reported in this study. For example, Benkherouf et al. (2019) reported extrasynaptic effects of low concentrations of muscimol on the delta subunit of the GABAA receptor. These receptors could alter local and distal circuits associated with the PLC.
Following muscimol test sessions, animals were run in a “recovery” session to ensure that the effects of reversible inactivation were not lasting. All rats showed normal performance in the sessions after the highest concentration of muscimol. Then, the rats were tested in a session with no infusions to serve as a local control for an infusion of DAMGO ([d-Ala2, N-Me-Phe4, Gly5-ol]-enkephalin; 1.0 μg/μl; Giacomini et al., 2021; White and Laubach, 2022), which took place the following day. A single dose of DAMGO was used due to the repeated drug infusions that were done during the inactivation sessions.
At least 1 week later, eight rats (four males and four females) were subsequently tested with unilateral infusions of muscimol (1.0 μg/μl) over 2 consecutive days, counterbalancing which hemisphere received infusions first. These additional infusions were done to test for the effects of inactivation on side bias, which was reported in a study of the frontal orienting field by Erlich et al. (2015).
All drugs were obtained from Tocris Bioscience and made into solutions using sterile PBS (pH 7.4). Infusions occurred by inserting a 33 ga injector into the guide cannula that was flush with the tip of the guide cannula. The injector was connected to a 10 μl Hamilton syringe via 0.38-mm-diameter polyethylene tubing. A volume of 1.0 μl of fluid was delivered at a rate of 0.25 μl/min with a syringe infusion pump (KD Scientific). The injector was left in place for 2 min after completion of the infusion to allow for diffusion of the solution, after which the injector was removed and the dummy cannula was replaced. Animals were tested in dual-offer sessions 1 h after muscimol infusions and 30 min after DAMGO infusions.
These studies infused a volume of 1 μl of fluid to deliver muscimol or DAMGO to the PLC. This volume of fluid has been used in previous studies, including an early methods paper on the use of muscimol for reversible inactivations (Martin and Ghez, 1999). They reported that the spread of 1 μl of muscimol, measured using a radioisotope method, was over ∼1.5 mm and did not change over a period of 2 h (their Fig. 2B2). Martin and Ghez (1999) reported that muscimol causes neuronal inhibition for periods of up to 2 h, which was validated for a fluorescently conjugated form of the drug using intracellular recordings by Allen et al. (2008). They further commented that muscimol is either bound to local GABA receptors or taken up by glia (based on Gallagher et al., 1983; Krogsgaard-Larsen et al., 1988). These processes limit the spread of muscimol from the infusion site. Based on the cannula placements, shown in Figure 1E, most of the infusions reported in the present study would have inhibited cells in the prelimbic cortex and, crucially, would have covered most cortical layers. Infusions in a few animals, in the rostral-most part of the PLC, would have also inactivated neurons in the underlying ventral orbital cortex.
Confirmation of cannula placement
Following all experimental test sessions, animals were anesthetized with isoflurane and injected intraperitoneally with Euthasol. Animals were then transcardially perfused with 500 ml of saline solution followed by 500 ml of 4% paraformaldehyde. Brains were removed and postfixed in 4% paraformaldehyde overnight and were then transferred to solutions containing 20% sucrose and 20% glycerol. Brains were sliced into 60 μm coronal sections using a freezing microtome and mounted onto gelatin-coated slides for Nissl staining via thionin. The thionin-treated sections were dried through a series of alcohol steps, covered with Clearium, and coverslipped. Sections were imaged using a Tritech Research scope (BX-51-F), Moticam Pro 282B camera, and Motic Images Plus 2.0 software. The most ventral point of the cannula track was compared against Paxinos and Watson (2014)’s atlas to confirm the coordinates of infusion sites. That version of the atlas uses the term Area 32 to indicate the region of the frontal cortex that is also called the prelimbic cortex.
Statistical analysis
Behavioral events were recorded through Med-PC and extracted through custom scripts written in Python (Anaconda distribution, https://www.continuum.io/). Statistical analyses were carried out using R (https://www.r-project.org/) run in Jupyter notebooks (http://jupyter.org/) via rpy2 and using the Python library pingouin (Vallat, 2018; https://pingouin-stats.org). Distributions of response latencies were estimated using the average-shifted histogram package for R (“ash”), based on Scott (2010). Statistical tests were performed using repeated measures ANOVA (rmANOVA; R: “aov”) to account for within-subject effects of session (i.e., training session or drug infusion), sex, trial type (single-offer vs dual-offer), value (high vs low stimulus/reward), and the interactions between these variables on measures of responding (i.e., latency, reward retrieval, time at spout, ITI). Dependent variables were logarithmically transformed. Reported statistics include p-values and F statistics. Where applicable, repeated measures post hoc testing used the function “pairwise_tests” from the pingouin library for Python. Spearman rank correlation coefficients were used to compare side bias to high-value preference (R: “cor.test”). Each rat was dropped in turn from the statistical tests to confirm that effects were observed in all rats and that there were no differences between animals motivated by regulated access to food and the two male rats motivated by regulated access to water.
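The leave-one-rat-out check described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis: a paired t statistic on log-transformed latencies stands in for the rmANOVA, the function names are hypothetical, and the latencies are made-up values rather than data from the study.

```python
import math
import statistics


def paired_t(diffs):
    """One-sample t statistic computed on paired differences."""
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def leave_one_rat_out(high_latency, low_latency):
    """Recompute the statistic with each rat removed in turn."""
    # Log-transform before differencing, mirroring the log-transformed dependent variables.
    diffs = [math.log(lo) - math.log(hi) for hi, lo in zip(high_latency, low_latency)]
    return [paired_t(diffs[:i] + diffs[i + 1:]) for i in range(len(diffs))]


# Hypothetical per-rat median latencies in seconds (NOT data from the study).
high = [0.52, 0.61, 0.48, 0.70, 0.55, 0.66]
low = [0.60, 0.72, 0.50, 0.85, 0.61, 0.80]

t_values = leave_one_rat_out(high, low)
effect_in_all_subsets = all(t > 0 for t in t_values)
print(effect_in_all_subsets)
```

If the sign or size of the statistic changed sharply when one animal was removed, that animal would be a candidate driver of the group effect.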
ExGauss modeling
Median response latencies were used as the primary measure of the speed of performance in this study. ExGauss modeling was used to further understand how learning and perturbation of the prelimbic cortex influenced performance. It is a data analysis approach that approximates the distribution of response times arising from the combination of two statistical processes. A Gaussian component accounts for the mean response time. An exponential component accounts for the “long-tail” variance of the response times. Together, the two components account for the overall shape of the response time distribution (Heathcote et al., 1991).
ExGauss models have three parameters: mu, the mean of the Gaussian distribution; sigma, the standard deviation of the Gaussian distribution; and tau, the mean of the exponential distribution. All three parameters were estimated, but the sigma parameter was not found to vary in any systematic fashion and was not considered further. As there were fewer low-value choices on dual-offer trials compared with the other trial types, we focused on differences between high- and low-value single-offer trials and high-value dual-offer trials in interpreting results from the ExGauss models.
To estimate the parameters of the ExGauss models, we used the MMest.ExGauss function from the RobustEZ package (Wagenmakers et al., 2008). Effects of the ExGauss parameters on behavior were first assessed using multivariate ANOVA (statsmodels library for Python). rmANOVA was then used to determine the effects of session, sex, trial type, value, and the interactions between these variables on the ExGauss parameters.
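To make the three ExGauss parameters concrete, the sketch below simulates response times as the sum of a Gaussian and an exponential variate and recovers the parameters with a plain method-of-moments estimator (mean = mu + tau, variance = sigma² + tau², third central moment = 2·tau³). It is not the RobustEZ `MMest.ExGauss` routine, which may differ in its details; the function name and the simulated parameter values are assumptions chosen for illustration.

```python
import random


def exgauss_moments(rts):
    """Method-of-moments estimates (mu, sigma, tau) for an ex-Gaussian sample."""
    n = len(rts)
    mean = sum(rts) / n
    var = sum((x - mean) ** 2 for x in rts) / n
    m3 = sum((x - mean) ** 3 for x in rts) / n   # third central moment
    tau = (max(m3, 0.0) / 2.0) ** (1.0 / 3.0)    # from m3 = 2 * tau**3
    mu = mean - tau                              # from mean = mu + tau
    sigma = max(var - tau ** 2, 0.0) ** 0.5      # from var = sigma**2 + tau**2
    return mu, sigma, tau


# Simulated latencies (s) with illustrative parameters: mu=0.40, sigma=0.05, tau=0.20.
rng = random.Random(0)
rts = [rng.gauss(0.40, 0.05) + rng.expovariate(1.0 / 0.20) for _ in range(100_000)]

mu_hat, sigma_hat, tau_hat = exgauss_moments(rts)
print(round(mu_hat, 2), round(tau_hat, 2))
```

Moment estimators of this kind are sensitive to extreme values because they rely on the third moment, which is one reason robust variants exist.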
Drift diffusion models
The HDDM package (Wiecki et al., 2013; version 0.9.6) was used to quantify the effects of learning, sex, and drugs on decision-making. We used the HDDM package to estimate four key parameters of DDMs: the drift rate, decision threshold, starting point bias, and non-decision time (Fig. 1D). Drift rate accounts for how quickly the rats integrate information about the stimuli. Threshold accounts for how much information is needed to trigger a decision. Starting point bias accounts for a shift in the starting point of evidence accumulation toward one of the two choices. Non-decision time accounts for the time taken to initiate stimulus processing and execute the motor response (choice).
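The roles of the four parameters can be illustrated with a simple forward simulation of a drift diffusion process. This is a generic Euler sketch, not the HDDM likelihood or fitting procedure; the symbols `v` (drift rate), `a` (threshold), `z` (starting point as a fraction of `a`), and `t0` (non-decision time) follow common DDM notation, and all numerical values are illustrative rather than estimates from the study.

```python
import random


def simulate_ddm(v, a, z=0.5, t0=0.3, dt=0.001, noise=1.0, max_t=5.0, rng=None):
    """Simulate one trial; return (choice, response_time_s)."""
    rng = rng or random.Random()
    x = z * a                  # starting point between the lower (0) and upper (a) bounds
    t = 0.0
    step_sd = noise * dt ** 0.5
    while 0.0 < x < a and t < max_t:
        x += v * dt + rng.gauss(0.0, step_sd)   # deterministic drift plus diffusion noise
        t += dt
    if x >= a:
        choice = "upper"
    elif x <= 0.0:
        choice = "lower"
    else:
        choice = None          # no bound reached before max_t
    return choice, t0 + t


def mean_rt(a, n_trials=300, seed=1):
    rng = random.Random(seed)
    return sum(simulate_ddm(v=1.0, a=a, rng=rng)[1] for _ in range(n_trials)) / n_trials


rt_low_threshold = mean_rt(a=1.0)
rt_high_threshold = mean_rt(a=2.0)
print(rt_low_threshold < rt_high_threshold)
```

With the drift rate held constant, lowering the threshold shortens simulated response times, which is the qualitative pattern a reduced decision threshold would produce.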
HDDM models were fit that allowed for a single DDM parameter (drift rate, threshold, starting point bias, non-decision time) to vary freely over sessions (e.g., choice learning or drug dose). The other parameters were estimated globally (i.e., using data from all sessions in a given experiment: cue learning, choice learning, muscimol, or DAMGO). Hierarchical models were trained using data from either single-offer or dual-offer trials. Models were fit by running version 0.9.6 of the package under Python 3.7. Sampling parameters were taken from Pedersen et al. (2021): models were run with 50,000 samples, the first 25,000 samples were discarded as burn-in, and every 10th sample was retained for parameter estimation (thinning of 10). These parameters were necessary to remove minor autocorrelation in the posteriors for the starting point bias parameter and were then used for all summaries in this paper. HDDM is sensitive to outliers (Wiecki et al., 2013), so we included latencies up to the 95th percentile of the distribution in our analyses. Exploratory data analysis found that the 95th percentile cutoff was approximately the same for the response time distributions from the male and female rats and across sessions in the muscimol and DAMGO experiments.
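The sampling settings and the latency cutoff described above amount to two small pieces of bookkeeping, sketched here in plain Python. This does not call HDDM; the function names are hypothetical, the percentile uses the `statistics.quantiles` "inclusive" interpolation (the authors' exact percentile method is not stated), and the example latencies are placeholders.

```python
import statistics


def retained_indices(n_samples=50_000, burn=25_000, thin=10):
    """Indices of posterior samples kept after burn-in and thinning."""
    return range(burn, n_samples, thin)


def trim_latencies(latencies, pct=95):
    """Keep latencies at or below the given percentile of the distribution."""
    cutoff = statistics.quantiles(latencies, n=100, method="inclusive")[pct - 1]
    return [x for x in latencies if x <= cutoff]


n_kept = len(retained_indices())
print(n_kept)  # 2500

# Placeholder latencies (arbitrary units), one value at each of 1..100.
kept = trim_latencies(list(range(1, 101)))
print(len(kept))  # 95
```

Under these settings, (50,000 − 25,000) / 10 = 2,500 samples per model remain for parameter estimation.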
Convergence was validated based on the Gelman–Rubin statistic (Gelman and Rubin, 1992), which was below 1.1 for all models reported in this paper, with a posterior predictive check that compared response times from the rats and the models, and by plotting the probability density functions for the observed and predicted response times for each rat. The autocorrelations and distributions of the parameters and predictions of the response latency distributions for each animal were visually inspected to further evaluate convergence.
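For readers unfamiliar with the Gelman–Rubin statistic, the sketch below computes the basic (non-split) version from several chains of samples for one parameter. It is a textbook formulation written for illustration, not the routine used by HDDM, and the simulated chains are assumptions rather than posteriors from the study.

```python
import random
import statistics


def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])  # W
    between = n * statistics.variance(chain_means)                       # B
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


rng = random.Random(0)
# Four chains sampling the same distribution: statistic should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Chains stuck at different means: statistic should be well above 1.1.
stuck = [[rng.gauss(m, 1.0) for _ in range(1000)] for m in (0.0, 0.0, 3.0, 3.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(r_mixed < 1.1, r_stuck > 1.1)
```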
Details on model fitting, convergence checks, and effects of running models with each rat removed, to ensure that no single rat drove the results reported in this paper, are provided in the Extended Data Figure 6-1. We also include code for our use of the HDDM package, as a Jupyter notebook, and raw data files for cue learning, choice learning, testing with muscimol, and testing with DAMGO in the extended data.
Validation of results from HDDM using other sequential sampling models
This study's main findings were based on the HDDM package (Wiecki et al., 2013). We aimed to confirm the effects of PFC inactivation and opioid stimulation with an alternative implementation of the standard DDM and with a different type of sequential sampling model, the linear ballistic accumulator (LBA; Brown and Heathcote, 2008). We used the rlssm package (Fontanesi, 2021). While rlssm allows for hierarchical Bayesian modeling, it does not allow for simultaneous evaluation of factors like training session, sex, or drug within a single model (unlike HDDM).
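The LBA differs from the DDM in that each option has its own accumulator that rises linearly, without within-trial noise, from a random starting point to a common threshold. The sketch below simulates that race in its generic form; it is not the rlssm implementation, and the parameter names (`b` threshold, `A` upper limit of the start-point distribution, `s` between-trial drift variability, `t0` non-decision time) and values are illustrative assumptions.

```python
import random


def simulate_lba(drift_means, b=1.0, A=0.5, s=0.3, t0=0.2, rng=None):
    """Simulate one LBA trial; return (index_of_winning_accumulator, response_time_s)."""
    rng = rng or random.Random()
    while True:
        finish_times = []
        for v in drift_means:
            start = rng.uniform(0.0, A)   # start point varies from trial to trial
            rate = rng.gauss(v, s)        # drift varies from trial to trial
            # An accumulator with a non-positive rate never reaches threshold.
            finish_times.append((b - start) / rate if rate > 0 else float("inf"))
        fastest = min(finish_times)
        if fastest != float("inf"):       # resample if no accumulator can finish
            return finish_times.index(fastest), t0 + fastest


rng = random.Random(0)
trials = [simulate_lba([1.0, 0.5], rng=rng) for _ in range(1000)]
p_first = sum(1 for winner, _ in trials if winner == 0) / len(trials)
print(p_first > 0.5)
```

Here accumulator 0 stands in for the option with the higher mean drift, so it wins most simulated races.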
To evaluate the effects of PFC inactivation and μ-opioid stimulation, we ran separate DDM and LBA models for each dataset. For muscimol, we compared models based on sessions after PBS infusions and sessions after the highest concentration of muscimol (1.0 μg/μl). For DAMGO, we compared models from the control sessions prior to testing with DAMGO with the sessions under DAMGO. Models were fit for trials with dual offers.
We did not use the rlssm package to analyze choice learning data because it would have required fitting separate models for each session and sex. Initial tests showed significant differences in the qualities of model fits between DDM and LBA models on single behavioral sessions and between the sexes. These models used only a tenth of the data used for our HDDM models (five training sessions and two sexes), and their parameter estimates were unreliable.
For the modeling using rlssm, we ensured model convergence as in our HDDM analysis. The Gelman–Rubin statistic was under 1.1 for all models. We also evaluated the “widely applicable information criterion” (WAIC; Watanabe, 2013), which accounts for the accuracy and complexity of the models. WAIC values for control and drug sessions were comparable. Additionally, we conducted posterior predictive checks, simulated choices and latencies from the models and compared them with the observed data, and assessed that the sampled posteriors for the model parameters varied randomly over samples, lacked significant autocorrelation, and had symmetric distributions around their means.
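The WAIC combines a measure of fit (the log pointwise predictive density) with a penalty for effective model complexity. The sketch below computes it from a matrix of pointwise log-likelihoods on the deviance scale; it is a generic formulation for illustration, not the calculation performed inside rlssm, and the function name and toy input are assumptions.

```python
import math
import statistics


def waic(log_lik):
    """WAIC from log_lik[s][i]: log-likelihood of observation i under posterior sample s."""
    n_obs = len(log_lik[0])
    lppd = 0.0     # log pointwise predictive density (fit)
    p_waic = 0.0   # effective number of parameters (complexity penalty)
    for i in range(n_obs):
        column = [row[i] for row in log_lik]
        peak = max(column)
        # Log of the mean likelihood, computed stably by factoring out the maximum.
        lppd += peak + math.log(sum(math.exp(v - peak) for v in column) / len(column))
        p_waic += statistics.variance(column)
    return -2.0 * (lppd - p_waic)


# Toy input: three posterior samples, two observations, no posterior variability.
toy = [[-1.0, -2.0], [-1.0, -2.0], [-1.0, -2.0]]
print(waic(toy))  # 6.0
```

Lower values indicate a better trade-off between accuracy and complexity, so comparable values across control and drug sessions suggest comparable model quality.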
Results
Effects of cue learning on performance and decision-making
Rats performed well in the cue learning stage of training, with high detection rates for both stimuli (Fig. 2A). They detected the high-value stimulus more frequently (median = 96.01%) compared with the low-value stimulus (median = 93.39%, paired t test: p = 0.005, t(17) = 3.184). All rats responded faster over the training sessions (p = 1.22 × 10−5, F(4,142) = 7.696; Fig. 2B), with consistently shorter latencies when responding to the high-value stimulus compared with the low-value stimulus (p = 1.12 × 10−13, F(1,142) = 67.674). Additionally, females responded significantly faster than males across all sessions (p = 0.030, F(1,15) = 5.720).
Figure 2-1
Reward retrieval times and inter-trial intervals. During cue learning, reward retrieval times (p = 1.03e-05, F(4,139) = 7.816) and ITI (p = 3.91e-05, F(4,140) = 6.952) decreased over sessions, and females retrieved rewards more quickly than males (p = 0.01, F(1,15) = 8.852).
Figure 2-2
Consummatory measures. During cue learning, we observed no differences between female and male rats over four measures of the subjective value of the liquid sucrose rewards. These include the median licking frequency (A), the number of licks in a bout (B), the median duration of the bouts (C), and the median number of licks in a bout (D).
Two other key measures of performance were reward retrieval times (a measure of the incentive salience of the rewarding fluids) and the duration of the rats’ licking bouts (a measure of the subjective values of the fluids to the rats). There were trends toward shorter retrieval times (p = 0.057, F(4,28) = 2.613) and shorter intertrial intervals (ITI; p = 0.055, F(4,28) = 2.640) across sessions. Notably, females retrieved rewards significantly faster than males (1.763 s for males vs 0.977 s for females, F(1,7) = 7.015, p = 0.033). We did not observe any difference between the females and males over four common measures of subjective value during reward consumption (Extended Data Fig. 2-2). These findings suggest that females responded faster than males potentially due to a sex difference in motivation (incentive salience, indicated by the faster retrieval times) rather than a difference in the subjective values of the rewarding fluids.
Drift diffusion models revealed several key aspects of cue learning. As expected, the drift rate was higher for trials with the high-value stimulus for both sexes (Fig. 3A,B). Threshold was generally lower for the females (Fig. 3C,D) and was lower for the high-value stimulus compared with the low-value stimulus for both sexes. Interestingly, females did not show a decrease in threshold for the high-value option across learning sessions. Most notably, starting point bias increased over learning for both sexes (Fig. 3E,F). Females initially showed a negative bias for the low-value option (Fig. 3E), suggesting they needed more evidence to detect it. The negative bias faded with learning, reaching a neutral level (∼0.5) by the third session of cue learning. In contrast, males showed no such shift in bias for the low-value stimulus (Fig. 3F), which remained near 0.5 over the period of cue learning. Finally, non-decision time was lower and less variable for females compared with males, but did not change with learning or stimulus value (Fig. 3G,H).
Effects of choice learning on performance
Over the five sessions of choice learning, the rats experienced dual-offer trials on one-third of the trials. The other trials were the same as during cue learning, with one stimulus presented on each trial. Detection rates were measured on these single-offer trials (Extended Data Fig. 4-1). The rats detected the high-value stimulus more often (97.703%) than the low-value stimulus (94.768%; paired t test on median detection rates: p = 0.003, t(17) = 3.448). There was no difference in median detection rates between females and males for the high-value stimulus (t test: p = 0.675, t(16) = −0.425) or the low-value stimulus (p = 0.102, t(16) = −1.734).
On dual-offer trials, females demonstrated greater high-value preference than males across all five sessions (p = 0.001, F(1,16) = 15.99; Fig. 4A). Males motivated by food or fluid regulation showed overlapping values for this measure (Extended Data Fig. 4-2A). Latencies to the high-value stimuli were briefer for all rats compared with latencies to the low-value stimulus (p ≤ 2 × 10−16, F(1,304) = 128.948; Fig. 4B). Latencies were also briefer for single-offer trials compared with dual-offer trials (p = 7.59 × 10−11, F(1,304) = 45.534). There was a significant interaction between session and sex (p = 0.030, F(4,304) = 2.711). Post hoc tests revealed that males, but not females, showed reduced choice latencies over sessions on dual-offer trials but not on single-offer trials. A significant interaction between value and sex (p = 0.022, F(1,304) = 5.257) was also found, with females responding more quickly than males specifically to the high-value stimulus. As above, males motivated by food or fluid regulation showed overlapping values for this measure (Extended Data Fig. 4-2B).
Figure 4-1
Stimulus detection during choice learning. Both females (blue) and males (green) detected the stimuli with rates above 90% for most sessions.
Figure 4-2
Choice preferences and latencies for rats trained with liquid sucrose or water rewards. Choice preferences are shown in panel A. Choice latencies are shown in panel B. Rats trained and tested with liquid sucrose as the reward are shown as blue points. Rats trained and tested with water as the reward are shown as green points. The values for both measures overlapped by the rewarding fluid and randomly varied over sessions. These data indicate that the type of regulation (regulated-access to food and liquid sucrose rewards or regulated-access to fluid and water rewards) did not alter the learning or performance of the rats.
Differences in the shapes of the distributions of the response latencies were apparent between the male and female rats. Examples are shown in Figure 4C. The male rats showed reductions in the long tails of the latency distributions over the period of choice learning. In contrast, distributions from the females showed more extended tails. These features of the latency distributions were quantified using ExGauss modeling (Heathcote et al., 1991). This analysis accounts for the Gaussian mean (peak) and exponential variability (tail) of the latency distributions. Mean latencies were lower for females compared with males (p = 0.001, F(1,14) = 16.997; Fig. 4D). There was also a significant effect of the number of offers on the Gaussian mean (p = 0.009, F(2,218) = 4.74). Post hoc testing revealed that the Gaussian mean was significantly lower for single-offer, high-value trials than single-offer, low-value trials (p = 0.033). For exponential variability, males showed lower overall variability (p = 0.043, F(1,14) = 4.938) and reduced variability over the period of choice learning (p = 5.15 × 10−5, F(4,218) = 6.577; Fig. 4E). A significant interaction between the number of offers and sex was found (p = 6.6 × 10−6, F(2,109) = 13.334). Post hoc testing revealed that in females, exponential variability was significantly greater on single-offer, low-value trials compared with both single-offer, high-value (p = 6.4 × 10−6) and dual-offer, high-value (p = 0.006) trials (Fig. 4E).
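To make the two ExGauss components concrete, the sketch below estimates the Gaussian mean (mu), Gaussian standard deviation (sigma), and exponential tail (tau) of a latency distribution by the method of moments. This is a simplified stand-in and not necessarily the fitting procedure used in the analysis above. The function name `exgauss_moments` and the simulated latencies are assumptions made for illustration.

```python
import math
import random
import statistics


def exgauss_moments(latencies):
    """Method-of-moments estimates (mu, sigma, tau) of an ex-Gaussian."""
    n = len(latencies)
    mean = statistics.fmean(latencies)
    sd = statistics.pstdev(latencies)
    # Sample skewness; the exponential tail makes it positive.
    skew = sum((x - mean) ** 3 for x in latencies) / (n * sd ** 3)
    skew = max(skew, 0.0)  # clip at 0 when there is no measurable tail
    # For an ex-Gaussian, skewness = 2 * tau**3 / sd**3.
    tau = sd * (skew / 2.0) ** (1.0 / 3.0)
    mu = mean - tau
    sigma = math.sqrt(max(sd ** 2 - tau ** 2, 0.0))
    return mu, sigma, tau


# Hypothetical latencies (s): Gaussian peak plus an exponential tail.
rng = random.Random(0)
true_mu, true_sigma, true_tau = 0.5, 0.05, 0.2
latencies = [
    rng.gauss(true_mu, true_sigma) + rng.expovariate(1.0 / true_tau)
    for _ in range(20000)
]
mu, sigma, tau = exgauss_moments(latencies)
print(f"mu = {mu:.3f}, sigma = {sigma:.3f}, tau = {tau:.3f}")
```

In this parameterization, a shrinking long tail, as described for the males over choice learning, corresponds to a smaller tau, whereas a shift of the peak corresponds to a change in mu.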
Beyond these effects on response latency, females persisted in retrieving rewards somewhat faster than males, but the difference by sex was no longer significant (p = 0.263, F(1,8) = 1.452). Retrieval times did not change over sessions (p = 0.614, F(4,32) = 0.656). Females retrieved rewards in 1.396 s [credible interval (CI), 1.345–1.448] and males in 1.669 s (CI, 1.397–1.942). None of the rats showed evidence of a side bias during choice learning (p = 0.822, F(4,64) = 0.38).
Effects of choice learning on decision-making
Drift diffusion models found that the females showed generally stable DDM parameters across sessions of choice learning (Fig. 5). In contrast, the males showed changes in threshold (Fig. 5B) and starting point bias (Fig. 5C). Threshold was reduced, and starting point bias was increased in the fourth and fifth sessions of choice learning compared with the first session of choice learning. These changes in the DDM parameters suggest that the males needed less information to choose after the third session of choice learning and less information to choose the high-value stimulus compared with the low-value stimulus in those sessions. This finding tracks the changes in response time variability from the ExGauss models shown in Figure 4E.
Other notable differences between the females and males were that females had higher drift rates compared with males over the first four sessions of choice learning (Fig. 5A), higher starting point bias in the first two sessions of choice learning (Fig. 5C), and higher overall non-decision times across all sessions of choice learning (Fig. 5D). To summarize the differences between the DDMs for the males and females, representations of the DDMs are shown in Figure 5E,F. These represent parameters estimated across four different models for the dual-offer trials, each allowing one DDM parameter to freely vary by sex and learning session. Models are shown for the first and fifth sessions of choice learning. The overall impression from these plots is that the females showed a stable model of the decision process over the period of choice learning. In contrast, the males showed a reduction in the amount of information they needed to trigger a choice. Nevertheless, they still showed lower choice preferences for the high-value stimulus and longer choice latencies compared with the female rats in the fifth session of choice learning.
Prior to fitting the models described above, we fit models separately to the data sets from the female and male rats, to ensure that DDMs for each sex fit the animals’ data equally well (Extended Data Fig. 5). Four models were fit with each of the four DDM parameters free to vary with the sex of the rats. The models found an overall higher drift rate and starting point bias and lower non-decision time for the females compared with the males (Extended Data Fig. 5-1). The Gelman–Rubin statistic is reported for the models, with statistics for each rat, in Extended Data Table 5-1. Plots of the posterior probabilities for each DDM parameter are shown in Extended Data Figures 5-2 and 5-3. These include traces of the posteriors over the 2,500 samples (showing generally random fluctuations, commonly referred to as “hairy caterpillars” in the DDM literature; Krypotos et al., 2015), the autocorrelation of the samples (generally flat), and histograms of the samples (symmetric around the mean). The autocorrelation for the starting point bias showed higher values for models based on males compared with females, specifically at short lags. This pattern of autocorrelation was not observed for the combined models, in which starting point bias varied with session and sex, as summarized in Figure 5C.
Figure 5-1
DDM parameters for model comparing males and females across sessions. The plots show the posterior distributions and 95% Bayesian Credible Intervals for the four parameters from hierarchical HDDM models in which each parameter depended on sex. Data were from all five choice learning sessions. Posteriors are shown in blue for female rats and in green for male rats. The Credible Intervals are shown above each of the posterior distributions. For Drift Rate, fewer than 0.2% of the posteriors from the female rats were less than the mean from the male rats (i.e., p < 0.002). For Threshold, the posteriors were overlapping, but see the main body of the paper where an effect of choice learning session on Threshold for the male rats is described. For Starting Point Bias, fewer than 3% of the posteriors from the female rats were less than the mean from the male rats (i.e., p < 0.03). For Non-Decision Time, 2% of the posteriors from the female rats were greater than the mean from the male rats (i.e., p = 0.02). Dynamics of these parameters over sessions with choice learning are described in the paper.
Figure 5-2
Quality of posteriors for each DDM parameter. The plots show the traces from 2500 HDDM models, the autocorrelation over the traces, and the distribution of the estimated parameters of the DDM models. These models were fit separately with data from the male and female rats to ensure that the quality of the models was comparable for the sexes. Panel A shows data for the drift rate. Panel B shows data for threshold. Panel C shows data for starting point bias. Panel D shows data for non-decision time. Data for females are on the top and males on the bottom in each panel.
Figure 5-3
Posteriors for starting point bias from models based on all data. Starting point bias was stable in models fit with HDDM for the combined data from the male and female rats. These are the models reported in Figure 5.
Figure 5-4
Response time distributions: male rats. Histograms are shown for the observed (blue) and simulated (red) response times for the male rats from the HDDM models. Latencies for trials with choices of the high value stimuli are shown as positive values. Latencies for trials with choices of the low value stimuli are shown as negative values. Actual names of the rats are listed, and correspond to those in the shared data set, as noted in the Methods.
Figure 5-5
Response time distributions: female rats. Histograms are shown for the observed (blue) and simulated (red) response times for the female rats. Latencies for trials with choices of the high value stimuli are shown as positive values. Latencies for trials with choices of the low value stimuli are shown as negative values. Actual names of the rats are listed, and correspond to those in the shared data set, as noted in the Methods.
Figure 5-6
Results did not depend on any single rat. Eighteen HDDM models were fit for the data, with each of the 18 rats removed from one of the models. Results show that effects did not depend on any single rat. The largest difference in the DDM parameters between models based on males and females was drift rate. Threshold was overlapping for models based on males and females. Starting point bias was higher in females. Non-decision time was higher in males. Parameters from males are shown in green. Parameters from females are shown in blue. Xs denote the average value for males and females.
Table 5-1
Gelman-Rubin statistic. This statistic measures the extent to which posterior distributions are similar over five runs of 2500 hierarchical HDDM models (samples: 50,000, burn-in: 25,000, skip: 10) for data from the male or female rats.
Table 5-2
Posterior Predictive Checks: summary statistics for male rats. Summary statistics are reported for posterior predictive checks for the HDDM model based on data from the male rats. Statistics include the observed and simulated accuracy of task performance, and the mean, standard deviation, and quantile estimates of the upper and lower boundaries. The upper boundary represents choices of the high-value stimulus. The lower boundary represents choices of the low-value stimulus. SEM reports the squared difference between the mean observed data and the mean of simulated data sets. Credible reports whether the estimates from the observed and simulated data are within the 95% credible interval.
Table 5-3
Posterior Predictive Checks: summary statistics for female rats. The columns in the table below report the same statistics as those in Table 5-2.
Posterior predictive checks are provided in Extended Data Table 5-2 (male rats) and Extended Data Table 5-3 (female rats). These report statistics for the observed and simulated choice preferences and latencies for the DDMs and the rats. All measures reported in this paper were evaluated as being within the Bayesian credible intervals. Response time distributions for the male and female rats and the DDM predictions of their response times are shown in Extended Data Figures 5-4 and 5-5. The distributions were highly similar for all rats included in this study. To ensure that no single rat drove any of the effects in the DDM models, models were run with each rat removed from the data set, and the results are shown in Extended Data Figure 5-6. No single rat drove the results.
The sex differences during choice learning based on the rats’ performance in dual-offer trials were also apparent in single-offer trials (Fig. 6). DDM parameters were stable over sessions of choice learning for the females and showed cross-session changes for the males (Fig. 6). Drift rate was higher for the high-value stimulus for both the females (Fig. 6A) and males (Fig. 6B), as was found during cue learning (Fig. 3). However, a difference by sex was that the males showed an increase in drift rate over the five choice learning sessions, an effect that was not found in the females. Threshold was generally higher in males (Fig. 6D) compared with females (Fig. 6C), as was also observed in cue learning. However, only males showed reductions in threshold across the sessions of choice learning. For starting point bias, the negative starting point bias of females for the low-value stimuli was again notable (Fig. 6E). There was a clear separation in starting point bias by stimulus reward value in the females, and no effects of choice learning session on the parameter. In contrast, males showed more overlapping levels of starting point bias and an increase in bias over the period of choice learning (Fig. 6F). Finally, while non-decision times were generally lower in females, there was no effect of stimulus reward value and no effect of choice learning on this parameter of the DDMs.
Effects of reversible inactivation of the prelimbic cortex
To understand the role of the rat prelimbic cortex (PLC) in decision-making, we implanted the animals described above with an infusion cannula (Fig. 1E) and tested them after infusions of increasing doses of muscimol. Cannulae in a few of the rats terminated deep in the PLC, so infusions in those rats would have also inactivated neurons in the underlying ventral orbital cortex. Reversible inactivation did not change the overall pattern of results for the effects of sex, stimulus value, or number of offers, as reported above during the choice learning sessions. As such, sex as a biological variable was not assessed for the effects of PLC inactivation.
Detection rates were not affected by muscimol (high-value stimulus: p = 0.247, t(17) = 1.196; low-value stimulus: p = 0.06, t(17) = 1.956; Fig. 7A). The rats detected the high-value stimulus better than the low-value stimulus over the sessions in the muscimol experiments (p = 0.003, t(17) = 3.373). Four of the rats showed lower levels of detection under the highest concentration of muscimol.
Choice preference on dual-offer trials did not change at any dose of muscimol (p = 0.122, F(3,48) = 2.029; Fig. 7B). Most rats chose the high-value stimulus on >70% of the trials. The same four rats who showed reduced detection rates on single-offer trials showed lower high-choice ratios on dual-offer trials.
The main effect of muscimol was on the rats’ response latencies (Fig. 7C). As in choice learning, the rats responded faster to the high-value stimulus compared with the low-value stimulus (single offers, p = 1.3 × 10−5, F(1,17) = 36.699; dual offers, p = 8 × 10−7, F(1,16) = 59.899). With increasing concentration of muscimol, the rats had shorter median response latencies on the single-offer (p = 0.031, F(3,51) = 3.185) and the dual-offer trials (p = 0.023, F(3,48) = 3.458). There were no interactions between stimulus reward value and muscimol concentration for either type of trial (degrees of freedom differ between the trial types because some rats only chose the high-value option in some of the sessions). Post hoc testing found pairwise differences between the highest concentration of muscimol and PBS for both the single-offer (p = 0.028, t(17) = 2.400) and dual-offer (p = 0.006, t(17) = 3.086) trials.
To understand how inactivation of PLC affects the decision process, we fit HDDM models to the data from the series of muscimol inactivation sessions (Fig. 8). Separate models were fit for each of the four DDM parameters, in which one parameter was allowed to freely vary over doses of muscimol and the other parameters were estimated globally. The effects of PLC inactivation on dual-offer trials were specific to two parameters. Threshold decreased with increasing dose of muscimol (Fig. 8B). The probabilities that threshold was greater in the sessions with PBS and 0.01 µg/µl of muscimol compared with the sessions with 1.0 µg/µl of muscimol were 0.0028 and 0.0072, respectively. Starting point bias also increased with increasing concentration of muscimol (Fig. 8C), from ∼0.53 in the PBS and 0.01 μg/μl sessions to ∼0.56 in the 0.1 and 1.0 μg/μl sessions. This parameter was the only behavioral or DDM measure that changed between the 0.01 and 0.1 μg/μl concentrations of muscimol and could reflect a subthreshold effect of muscimol on the animals’ performance strategy. This would indicate a selective reduction in the amount of information needed to choose the higher-value option after the animals experienced the first session with the lowest concentration of muscimol. The distributions for drift rate and non-decision time were fully overlapping over the range of doses for muscimol (Fig. 8A,D). Taken together, these findings suggest that inactivation of the rat PFC reduced the amount of information needed to trigger a choice, especially when the rats chose the high-value stimulus. This result is apparent in Figure 8E, which combines the effects across the HDDM parameters.
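The link between a lower threshold and briefer latencies can be illustrated with a small simulation of a basic drift diffusion process. This is a hedged sketch using HDDM-style parameter names (v, a, z, t) with within-trial noise fixed at 1. It is not the HDDM fitting code, and the function name `simulate_ddm` and all parameter values are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_ddm(v, a, z, t0, n_trials, rng, dt=0.001, max_t=5.0):
    """Simulate choices and latencies from a basic drift diffusion model.

    v: drift rate, a: threshold (boundary separation),
    z: relative starting point (0-1), t0: non-decision time (s).
    Returns (choice, latency) pairs; choice is 1 for the upper
    (high-value) boundary and 0 for the lower (low-value) boundary.
    """
    noise_sd = dt ** 0.5  # within-trial noise fixed at 1
    trials = []
    for _ in range(n_trials):
        x, t = z * a, 0.0
        while 0.0 < x < a and t < max_t:
            x += v * dt + noise_sd * rng.gauss(0.0, 1.0)
            t += dt
        if 0.0 < x < a:
            continue  # no boundary reached within max_t; drop the trial
        trials.append((1 if x >= a else 0, t + t0))
    return trials


rng = random.Random(1)
# Hypothetical parameter values; only the threshold differs.
control = simulate_ddm(v=1.0, a=2.0, z=0.5, t0=0.2, n_trials=400, rng=rng)
lowered = simulate_ddm(v=1.0, a=1.4, z=0.5, t0=0.2, n_trials=400, rng=rng)

median_control = statistics.median(t for _, t in control)
median_lowered = statistics.median(t for _, t in lowered)
print(f"median latency, a = 2.0: {median_control:.3f} s")
print(f"median latency, a = 1.4: {median_lowered:.3f} s")
```

With drift rate, starting point, and non-decision time held constant, the run with the lower threshold reaches a boundary sooner, because less accumulated evidence is needed to trigger a choice.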
Figure 8-1
PFC inactivation results analyzed with two other sequential sampling models. Effects of PFC inactivation on decision threshold were confirmed using two alternative methods for sequential sampling modeling. Panel A shows parameters from Drift-Diffusion Model (DDM) fits using the rlssm package. Consistent with HDDM results, PFC inactivation lowered the decision threshold. These models also showed a decrease in drift rate with PFC inactivation. Panel B presents parameters from Linear Ballistic Accumulator (LBA) models. Similar to HDDM, the distance to threshold (parameter A in LBA models) decreased with PFC inactivation. These models also showed a reduction in non-decision time with PFC inactivation. In all three sequential sampling models, PFC inactivation consistently reduced the decision threshold. Asterisks: ** p < 0.01, * p < 0.05.
Figure 8-2
HDDM models for single-offer trials in the PFC inactivation experiments. For single-offer trials, there were increasing effects of the concentration of muscimol on drift rate (A) and starting point bias (C), and decreasing effects on threshold (B) and non-decision time (D). Drift rate was significantly higher in the 1.0 μg/μl sessions compared to the PBS control sessions. Threshold and non-decision time were lower and starting point bias was higher at 1.0 μg/μl compared to the PBS and 0.01 μg/μl sessions based on Bayesian credible intervals (p < 0.05). Asterisks: * p < 0.05.
We fit decision-making models using two other methods to ensure that our findings did not depend on the use of the HDDM package. First, we fit standard DDM models using the rlssm package. PLC inactivation reduced drift rate and threshold (Extended Data Fig. 8-1A). The Bayesian estimate of the mean drift rate from the PBS sessions was 0.945 [credible interval (CI), 0.744–1.137] and 0.706 (CI, 0.474–0.940) from the 1.0 μg/μl muscimol sessions. Fewer than 3% of posteriors for drift rate from the muscimol sessions were larger than the mean drift rate from the PBS sessions (p = 0.023). The Bayesian estimate of the mean threshold from the PBS sessions was 1.556 (CI, 1.422–1.699) and 1.414 (CI, 1.264–1.569) from the 1.0 μg/μl muscimol sessions. Fewer than 4% of posteriors for the threshold from the muscimol sessions were larger than the mean threshold from the PBS sessions (p = 0.036). No effects of PLC inactivation were found for starting point bias or non-decision time using the DDM models in rlssm. The common finding between the models from HDDM and rlssm's DDM implementation concerned threshold, which was reduced in both types of DDM models.
Second, we fit linear ballistic accumulator (LBA) models, again using the rlssm package, and PLC inactivation decreased the “distance to decision threshold” parameter (called A in the LBA models) and the non-decision time compared with PBS controls (Extended Data Fig. 8-1B). For the PBS sessions, the Bayesian estimate of the mean distance to threshold was 1.457 (CI, 1.023–1.903). For the muscimol sessions, the mean distance to threshold was 0.687 (CI, 0.284–1.208). Fewer than 1% of posteriors for distance to threshold from the muscimol sessions were larger than the mean distance to threshold from the PBS sessions (p = 0.003). The Bayesian estimate of non-decision time was 0.196 (CI, 0.123–0.261) from the PBS sessions and 0.124 (CI, 0.071–0.177) from the muscimol sessions. Fewer than 1% of posteriors for non-decision time from the muscimol sessions were larger than the mean non-decision time from the PBS sessions (p = 0.002). There were no differences in drift rate or initial evidence in the LBA models. These results validate the results reported above for the effects of PLC inactivation on the threshold parameter of the HDDM and rlssm's DDM models.
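The posterior comparisons reported above (the fraction of posterior samples from drug sessions that exceed the mean of the control posterior) can be sketched as follows. The helper names `posterior_tail_prob` and `credible_interval` are hypothetical. The Gaussian draws are stand-ins for real MCMC samples: they are centered on the rlssm DDM threshold estimates reported above, and their spreads are assumptions.

```python
import random
import statistics


def posterior_tail_prob(samples, reference_mean, tail="greater"):
    """Fraction of posterior samples above (or below) a reference mean."""
    if tail == "greater":
        hits = sum(s > reference_mean for s in samples)
    else:
        hits = sum(s < reference_mean for s in samples)
    return hits / len(samples)


def credible_interval(samples, mass=0.95):
    """Equal-tailed credible interval from posterior samples."""
    ordered = sorted(samples)
    lo = ordered[int((1 - mass) / 2 * (len(ordered) - 1))]
    hi = ordered[int((1 + mass) / 2 * (len(ordered) - 1))]
    return lo, hi


# Hypothetical stand-ins for posterior samples of threshold.
rng = random.Random(0)
pbs = [rng.gauss(1.556, 0.071) for _ in range(10000)]
muscimol = [rng.gauss(1.414, 0.078) for _ in range(10000)]

# Probability that threshold under muscimol exceeds the PBS mean.
p = posterior_tail_prob(muscimol, statistics.fmean(pbs), tail="greater")
ci = credible_interval(muscimol)
print(f"P(muscimol > mean PBS) = {p:.3f}")
print(f"95% CI under muscimol: {ci[0]:.3f} to {ci[1]:.3f}")
```

A small proportion indicates that the drug-session posterior lies almost entirely below the control mean, which is the sense in which the p values for the posterior comparisons are reported in the text.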
For single-offer trials, all four parameters of the DDMs differed at the highest concentration of muscimol (1 μg/μl) compared with the PBS control session (Extended Data Fig. 8-2). These models evaluated the dependence of each DDM parameter on the concentration of muscimol, and did not include a further dependence on stimulus reward value as reported above for the cue and choice learning experiments. Threshold, starting point bias, and non-decision time were also different between the 0.01 and 1 μg/μl concentration of muscimol. Overall, PLC inactivation led to a higher drift rate, lower threshold, higher starting point bias, and lower non-decision time. That is, inactivation of PLC increased the speed of the detection process.
Other behavioral measures did not show effects of PLC inactivation. Most notably, median reward retrieval times were similar across concentrations of muscimol [e.g., 1.809 s (CI, 1.643–1.974) under PBS and 1.634 s (CI, 1.491–1.777) under 1.0 μg/μl muscimol]. This result suggests that PLC inactivation did not change the animals' incentive salience for the task stimuli and the resulting rewards. As in the sessions with cue and choice learning, females persisted in showing shorter reward retrieval times throughout the muscimol experiments. As a result, while there was an effect of sex on retrieval times (p = 0.002, F(1,8) = 19.023), there was no effect of muscimol (p = 0.254, F(3,24) = 1.446) or interaction between sex and drug (p = 0.292, F(3,24) = 1.317).
We also measured whether there were correlations between the median latency for the high-value choices, the median retrieval times for high-value choices, and high-value choice preferences. For both sexes, we found no significant correlations between choice latency and choice preference (Extended Data Fig. 8-3), reward retrieval and choice preference (Extended Data Fig. 8-4), or reward retrieval and choice latency (Extended Data Fig. 8-5), and no effects of any concentration of muscimol on the relationships between these measures.
Figure 8-3
Relations between choice latency and choice preference by sex. Scatterplots with linear regression fits for the relations between choice latency and choice preference from the male and female rats. No significant relations between these measures were found based on the Pearson correlation statistic. While the correlations appear to be stronger in males, the correlation coefficients were below 0.5 for all data and none were significant at p < 0.05.
Figure 8-4
Relations between reward retrieval and choice preference by sex. Scatterplots with linear regression fits for the relations between reward retrieval and choice preference from the male and female rats. No significant relations between these measures were found based on the Pearson correlation statistic. Females retrieved rewards faster (mean ± CI: 1.581 ± 1.511-1.651) than males (1.912 ± 1.800-2.023). While the correlations appear to be stronger in females, the correlation coefficients were below 0.6 for all data and none were significant at p < 0.05. The correlation between reward retrieval and choice preference for the females under the highest concentration of muscimol was -0.565 (p = 0.112).
Figure 8-5
Relations between reward retrieval and choice latency by sex. Scatterplots with linear regression fits for the relations between reward retrieval and choice latency from the male and female rats. A generally positive relationship was observed; however, the only significant relation between these measures was for the females in the PBS sessions (Pearson correlation: 0.767, p = 0.0157). The correlation between these measures for the males in those sessions was 0.639 (p = 0.063).
Other behavioral measures were not affected by PLC inactivation. There were no effects on the number of trials initiated (p = 0.551, F(3,48) = 0.710), the number of trials completed (p = 0.452, F(3,48) = 0.892), the percentage of omission errors (p = 0.288, F(3,48) = 1.292), the number of low-value rewards received (p = 0.609, F(3,48) = 0.615), or the number of high-value rewards received (p = 0.905, F(3,48) = 0.186). These results suggest that there were no changes in the rats’ abilities to perform the task. There were no differences by sex or due to the concentration of muscimol on relations between choice latencies and choice preferences (Extended Data Fig. 8-3), reward retrieval times and choice preferences (Extended Data Fig. 8-4), or reward retrieval times and choice latencies (Extended Data Fig. 8-5). Finally, there was no evidence for increased side bias with increasing muscimol concentrations (p = 0.078, F(3,48) = 2.413).
Four males and four females were further assessed for the effects of unilateral inactivation (data not shown). Latencies of ipsilateral and contralateral responses to the hemisphere of the infusion did not differ from one another (p = 0.306, F(1,104) = 1.058). There was no side bias caused by unilateral infusions of muscimol (p = 0.537, F(1,6) = 0.426), and side bias was weakly negatively correlated to high-value preference (r(14) = −0.491, p = 0.05558).
Effects of μ-opioid receptor stimulation in the prefrontal cortex
In sessions with DAMGO infusions (1 μg/μl) and the DAMGO-control sessions, effects of sex, value, and trial type were the same as those reported during acquisition and were unaffected by DAMGO. As such, sex as a biological variable was not assessed for the effects of DAMGO. Similar to reversible inactivation, detection rates (high-value stimulus: p = 0.109, t(17) = −1.687; low-value stimulus: p = 0.1035, t(17) = −1.720; Fig. 9A) and choice preferences (p = 0.622, F(1,16) = 0.253; Fig. 9B) were not affected by DAMGO. However, in contrast to muscimol, DAMGO resulted in longer response latencies (p = 5.32 × 10−8, F(1,111) = 34.089; Fig. 9C). Reward retrieval times (p = 6.62 × 10−16, F(1,111) = 89.300) and ITIs (p = 2.42 × 10−5, F(1,111) = 19.434) were also longer in sessions with DAMGO in the PFC. This result suggests that DAMGO in the PLC reduced the animals’ incentive salience for the task stimuli and the rewards associated with them.
Other behavioral measures were not affected by the DAMGO infusions into PLC. There was no effect on the number of trials initiated (p = 0.789, F(1,16) = 0.074), the number of trials completed (p = 0.760, F(1,16) = 0.097), the percentage of omission errors (p = 0.276, F(1,16) = 1.271), the number of low-value rewards received (p = 0.715, F(1,16) = 0.138), or the number of high-value rewards received (p = 0.876, F(1,16) = 0.025). Finally, there was no evidence for a side bias caused by DAMGO (p = 0.169, F(1,16) = 2.078).
HDDM modeling revealed nonspecific effects of cortical opioid stimulation on the parameters of the DDMs that contrasted with the effects of PLC inactivation (Fig. 10). For the dual-offer trials, the only parameter that was not changed by DAMGO was drift rate (Fig. 10A). DAMGO increased the threshold (Fig. 10B), reduced the starting point bias (Fig. 10C), and increased the non-decision time (Fig. 10D). The effects across the multiple parameters of the HDDM models are apparent in Figure 10E, which combines effects across the HDDM parameters. These results show that a different kind of perturbation of PLC neuronal activity (stimulation of μ-opioid receptors, likely increasing tonic activity in PLC through disinhibition of parvalbumin interneurons; Lau et al., 2020) can have distinct effects on choice behavior and the decision process. In this case, for DAMGO, the effects were generally opposite of those reported above for reversible inactivation by muscimol.
Figure 10-1
DAMGO results analyzed with two other sequential sampling models. Effects of μ-opioid stimulation in the PFC were characterized using two alternative methods for sequential sampling modeling. Panel A shows parameters from Drift-Diffusion Model (DDM) fits using the rlssm package. Consistent with HDDM results, DAMGO increased the decision threshold and the non-decision time. Panel B presents parameters from Linear Ballistic Accumulator (LBA) models. Similar to HDDM, the distance to threshold (parameter A in LBA models) and the non-decision time parameters increased under DAMGO. All three sequential sampling models showed common effects of μ-opioid stimulation. Asterisks: *** p < 0.001, ** p < 0.01, * p < 0.05.
Figure 10-2
HDDM models for single-offer trials in the DAMGO experiments. For single-offer trials, there were decreasing effects of DAMGO on drift rate (A) and starting point bias (C), and increasing effects on threshold (B) and non-decision time (D). The largest effect of DAMGO based on Bayesian credible intervals was on starting point bias, which shifted to negative values, indicating that the rats needed more information than usual to detect the rewarded stimuli. Asterisks: *** p < 0.001, ** p < 0.01, * p < 0.05.
DDMs fit using the rlssm package also found increased threshold and non-decision time following μ-opioid stimulation of the PLC (Extended Data Fig. 10-1A). The Bayesian estimate of the mean threshold was 1.434 [credible interval (CI), 1.339–1.539] for the control sessions and 1.565 (CI, 1.428–1.700) for the DAMGO sessions. Fewer than 3% of posteriors for threshold from the DAMGO sessions were smaller than from the control sessions (p = 0.029). The Bayesian estimate of the mean non-decision time was 0.230 (CI, 0.197–0.268) for the control sessions and 0.272 (CI, 0.243–0.305) for the DAMGO sessions. Fewer than 1% of posteriors for non-decision time from the DAMGO sessions were smaller than from the control sessions (p = 0.001). No effects of μ-opioid stimulation were found for drift rate or starting point bias using the DDM models in rlssm.
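The posterior comparisons reported above ("fewer than 3% of posteriors ... were smaller") can be illustrated with a minimal sketch. It is not the authors' analysis code: the helper names `prob_smaller` and `normal_from_ci` are hypothetical, and the posterior draws are stand-ins generated from normal distributions matched to the reported means and credible intervals, drawn independently for the two conditions. Real draws would come from the fitted model's MCMC chains, so the printed fraction is not expected to reproduce the reported p = 0.029.

```python
import numpy as np


def prob_smaller(treatment, control):
    """Fraction of paired posterior draws in which treatment < control."""
    treatment = np.asarray(treatment, dtype=float)
    control = np.asarray(control, dtype=float)
    if treatment.shape != control.shape:
        raise ValueError("posterior draws must have the same shape")
    return float(np.mean(treatment < control))


def normal_from_ci(rng, mean, lo, hi, n):
    """ASSUMPTION: approximate a posterior as a normal whose 95% interval is [lo, hi]."""
    sd = (hi - lo) / (2 * 1.96)
    return rng.normal(mean, sd, size=n)


rng = np.random.default_rng(0)
n_draws = 20000

# Stand-in draws built from the threshold estimates reported in the text.
control = normal_from_ci(rng, 1.434, 1.339, 1.539, n_draws)
damgo = normal_from_ci(rng, 1.565, 1.428, 1.700, n_draws)

p_threshold = prob_smaller(damgo, control)
print(f"Fraction of draws with DAMGO threshold < control threshold: {p_threshold:.3f}")
```

A small fraction indicates that most of the posterior mass for the DAMGO sessions lies above that for the control sessions, which is the sense in which the threshold "increased."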
DAMGO also impacted threshold and non-decision time in the LBA models (Extended Data Fig. 10-1B). The Bayesian mean for distance to threshold was 0.703 (CI, 0.350–1.111) for the control sessions and 1.249 (CI, 0.666–1.903) for the DAMGO sessions. Fewer than 4% of posteriors for distance to threshold from the DAMGO sessions were smaller than from the control sessions (p = 0.0352). The Bayesian mean for non-decision time was 0.090 (CI, 0.048–0.137) for the control sessions and 0.178 (CI, 0.134–0.223) for the DAMGO sessions. Fewer than 0.1% of posteriors for non-decision time from the DAMGO sessions were smaller than from the control sessions (p = 0.0001). There were no differences in drift rate or initial evidence in the LBA models. Taken together, these findings establish that μ-opioid stimulation of the PLC slows decision and non-decision processes on dual-offer trials across all three types of sequential sampling models.
For the single-offer trials, HDDM models found that all four parameters differed between the control and DAMGO sessions (Extended Data Fig. 10-2), with drift rate reduced, threshold increased, starting point bias reduced, and non-decision time increased under DAMGO. We note that starting point bias was reduced for both the single- and dual-offer trials by DAMGO. This finding suggests that μ-opioid stimulation reduced the preferences of the rats for the higher-value stimulus.
Interpretational issue: testing with a reversed luminance–reward mapping
As the main task used a parametric mapping between stimulus luminance and sucrose concentration, an additional experiment was conducted to determine if the luminance of the stimuli might have been a factor in how the rats learned to choose between the stimuli. Six male rats were trained using a reversed (negative) mapping of stimulus luminance and reward value (two LEDs → 16% sucrose and eight LEDs → 4% sucrose). In the first session of testing with single and dual offers, these rats showed generally lower detection rates on single-offer trials (Fig. 11A) than the rats in the main study (Fig. 2 and Extended Data Fig. 4-1). The detection rate for the low-luminance/high-value stimulus was 0.8241 (CI, 0.7168–0.9314). The detection rate for the high-luminance/low-value stimulus was 0.8981 (CI, 0.7986–0.9976). The rats’ choice preferences for the low-luminance/high-value stimulus were notably lower (mean, 0.5852; CI, 0.5439–0.6265) than those reported in Figure 4B. Finally, as shown in Figure 11B, the rats showed no clear pattern of latency differences by reward value/luminance (low-luminance/high-value stimulus: mean, 0.6349 s; CI, 0.4449–0.8250 s; high-luminance/low-value stimulus: mean, 0.6524 s; CI, 0.4866–0.8183 s). This experiment demonstrates that the salience of the stimuli was a factor in determining how the rats detected and chose stimuli in the main experiment reported in this paper.
Discussion
We investigated the influence of the rat prelimbic cortex (PLC) on decision-making. We inactivated the PLC with muscimol and stimulated cortical opioid receptors with the μ-agonist DAMGO. Our goal was to understand how the PLC affects two key parameters of drift diffusion models: the decision threshold and the starting point bias. These DDM parameters were associated with the human medial prefrontal cortex in two neuroimaging studies by Domenech and Dreher (2010) and Mulder et al. (2012). Our findings show that changes in PLC activity affect response speed but not choice preference. These results could be due to perceptual or value-based factors or a combination of the two. Computational modeling revealed that perturbations of neural processing in PLC specifically impacted the decision threshold, the amount of information needed to make a choice. Together, our results suggest that the rat PLC acts as an inhibitory brake on the decision-making process.
In training the animals to perform the behavioral task, we observed sex differences in how male and female rats responded to single offers of the task stimuli and made choices between them. Female rats made faster decisions and maintained consistent performance and models of the decision process during choice learning. In contrast, male rats initially made slower decisions but improved over time, showing decreases in their decision thresholds during choice learning.
In this discussion, we address the three main findings of our study: sex differences in learning and performance, the role of the PLC in the decision-making process, and the impact of cortical μ-opioid receptors on decision-making.
A potential sex difference in learning decision-making tasks
During the initial stage of training, as the rats learned that the more salient stimulus was paired with the higher concentration of sucrose, all animals showed reduced response latencies over time, with females consistently exhibiting shorter latencies than males (Fig. 2B). This difference in performance persisted into choice learning. Using drift diffusion models (DDMs), we explored potential sex differences in the decision process. Both sexes showed reduced decision thresholds on single-offer trials during cue learning (Fig. 3C,D). However, this reduction was significant for females only with the low-value stimulus. Additionally, starting point bias exhibited a clear sex difference, where females showed a negative bias toward the low-value stimulus (Fig. 3E). These findings suggest that while both sexes improved in stimulus detection, females had to overcome an initial bias against the low-value stimulus.
When tested with choices between stimuli, females maintained a consistent model for the decision process throughout choice learning (Fig. 5E). In contrast, males showed reductions in the decision threshold with choice learning (Fig. 5F), which paralleled reductions in their choice latencies (Fig. 4B, rightmost plot) and exponential variability (Fig. 4E). These differences might be due to the females having had a stronger preference for the high-value stimulus from the start of cue learning. Previous studies have shown that female rats prefer sweet fluids more than males (Sclafani et al., 1987; Reichelt et al., 2016), are less sensitive to low-value sucrose rewards (Curtis et al., 2004), and may be hyposensitive to rewards in general compared with male rats (Orsini and Setlow, 2017). However, Townsend et al. (2019) reported that female rats worked more for opioid drugs and natural rewards compared with male rats. Given the reward options in our task, this could have led to the females having higher incentive salience (Berridge and Robinson, 2016) for the high-value stimulus, as supported by the differences in starting point bias in the males and females in the DDM models (Figs. 3E,F, 5E,F, 6C) and by females showing quicker reward retrievals (Extended Data Fig. 2-1) but not longer licking bouts or other measures of subjective value during reward consumption (Extended Data Fig. 2-2).
The changes in the decision threshold for males (Fig. 5B) might reflect their higher levels of choice impulsivity (Hamilton et al., 2015; Cho et al., 2018). Despite not showing higher levels of general impulsivity (e.g., detecting task stimuli as well as the females, Fig. 2A and Extended Data Fig. 4-1), males demonstrated lower preferences for the high-value stimulus (Fig. 4A), slower choice latencies (Fig. 4B), and initially lower drift rates from the DDMs (Fig. 5A).
It must be pointed out that the differences reported here between the female and male rats could have been due to differences in body mass, and not sex. The male and female rats in this study were provided each day with different amounts of food (males, 14 g; females, 12 g) or fluid (for two of the males) to maintain their weights at 90–95% of their free-access weights. Therefore, the rats were matched by their relative weight compared with free access, and not their absolute mass.
Inactivation of PLC speeds performance without affecting choice
Inactivation of the PLC resulted in faster response times for all animals (Fig. 7C), without impacting stimulus detection (Fig. 7A) or choice preference (Fig. 7B). Drift diffusion modeling revealed that the decision threshold decreased with PLC inactivation (Fig. 8B), whereas drift rate and non-decision time remained unaffected. This effect was particularly pronounced for high-value choices, as indicated by an increase in starting point bias with PLC inactivation (Fig. 8C). These findings were consistent across three different methods for modeling the decision-making process (Fig. 8 and Extended Data Fig. 8-1).
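The link between a lower decision threshold and faster responses can be illustrated with a toy simulation. The sketch below is not the HDDM, rlssm, or LBA fitting used in this study. It is a plain Euler-Maruyama simulation of a two-boundary drift diffusion process with unit noise, and the function name `simulate_ddm` and all parameter values are hypothetical choices made only for illustration.

```python
import numpy as np


def simulate_ddm(drift, threshold, start_frac=0.5, ndt=0.2,
                 n_trials=2000, dt=0.001, max_t=5.0, seed=0):
    """Simulate a two-boundary DDM with unit noise (illustrative values only).

    Evidence starts at start_frac * threshold and diffuses until it reaches
    0 (lower boundary) or `threshold` (upper boundary). Returns response
    times (decision time + non-decision time) and choices (1 = upper).
    Trials that never reach a boundary within max_t keep an RT of NaN.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_trials, start_frac * threshold)
    rt = np.full(n_trials, np.nan)
    choice = np.zeros(n_trials, dtype=int)
    active = np.ones(n_trials, dtype=bool)

    for step in range(1, int(max_t / dt) + 1):
        n_active = int(active.sum())
        if n_active == 0:
            break
        # Euler-Maruyama update for the trials that are still accumulating.
        x[active] += drift * dt + np.sqrt(dt) * rng.standard_normal(n_active)
        hit_upper = active & (x >= threshold)
        hit_lower = active & (x <= 0.0)
        done = hit_upper | hit_lower
        rt[done] = step * dt + ndt
        choice[hit_upper] = 1
        active &= ~done
    return rt, choice


# Same drift rate and non-decision time; only the threshold differs.
rt_control, choice_control = simulate_ddm(drift=1.0, threshold=1.5)
rt_lowered, choice_lowered = simulate_ddm(drift=1.0, threshold=1.0)

print("mean RT, higher threshold:", np.nanmean(rt_control))
print("mean RT, lower threshold: ", np.nanmean(rt_lowered))
```

In this simplified setting, lowering the threshold shortens the simulated response times while drift rate and non-decision time stay fixed. Unlike the fitted models reported here, the toy simulation holds the relative starting point constant, so the proportion of upper-boundary choices also shifts with the threshold.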
The reduction in decision threshold might be due to a decrease in inhibitory control, a function dependent on the rodent prefrontal cortex (for review, see A. Bari and Robbins, 2013). In this case, the reduction in inhibitory control was specific to the decision process. We observed no changes in detection rates on single-offer trials (Fig. 7A), no loss of selectivity in choosing the high-value option (Fig. 7B), and no general disruptions in task performance for most of the rats under the highest concentration of muscimol.
The effects reported here are specific to the inactivation of the PLC and, in some rats, the underlying ventral orbital area. It is plausible that targeting a different frontal region for reversible inactivation would yield different outcomes. For example, Hanks et al. (2015) and Erlich et al. (2015) observed generally slower performance in an evidence accumulation task following inactivation of the frontal orienting field (FOF), located in the medial agranular cortex adjacent to the PLC. Similarly, Vázquez et al. (2024) found an overall slowing of performance and specific effects on drift rate in DDM models after optogenetic inactivation of the perigenual ACC, located caudal to the PLC. Additionally, lesions of the orbitofrontal cortex (OFC), ventral and lateral to the PLC region studied here, led to reduced accuracy in the five-choice serial reaction time task, which was interpreted as decreased attention to task stimuli (Chudasama et al., 2003). Exploring these cortical regions using the current task design would be informative. Inactivation of the FOF and ACC might yield opposite effects to those described here for PLC in terms of speed, while inactivation of the OFC might selectively influence choice preferences.
μ-Opioid stimulation of PLC slows performance without affecting choice
Opioid receptors, including μ-opioid receptors, are densely distributed within the rodent prefrontal cortex, specifically in the PLC (Lewis et al., 1983). However, the role of these receptors in cognitive tasks that depend on PLC processing has not been previously examined. μ-Opioid stimulation of the rat prefrontal cortex is known to enhance consummatory behavior (Mena et al., 2011, 2013; Giacomini et al., 2021, 2022) and trigger hedonic reactions during sucrose consumption (Castro and Berridge, 2017). These findings were reported from a cortical area referred to as the “medial OFC,” located in the lower half of the medial wall cortex anterior to the rhinal sulcus. In our study, the cannulae terminated in this area in nine of the rats (Fig. 1E), while other cannulae were placed more dorsally, where DAMGO infusions have been shown to not alter the consumption of liquid sucrose (White and Laubach, 2022).
DAMGO's effect in these medial regions involves activating parvalbumin interneurons (Lau et al., 2020), which disinhibits the local cortical region. Therefore, DAMGO's effects in the PLC were expected to contrast with those observed using reversible inactivation. Consistent with this expectation, DAMGO infusions resulted in opposite effects to those of reversible inactivation. Rats were slower to respond in both single- and dual-offer trials (Fig. 9C), with no impact on stimulus detection in single-offer trials (Fig. 9A) or choice preferences in dual-offer trials (Fig. 9B). DAMGO affected multiple parameters of the DDMs (Fig. 10), notably decreasing the starting point bias (Fig. 10C), indicating that rats required more information than usual to trigger choices of the high-value stimulus.
The local cortical μ-opioid stimulation effects observed in this study differ from those reported in a recent study of systemic opioid stimulation in humans (Eikemo et al., 2017). That study found an increase in starting point bias and drift rate under systemic opioid treatment, suggesting enhanced decision-making efficiency. The contrasting findings between local and systemic opioid stimulation underscore the complexity of opioid effects on cognitive processes. Further studies using animal and human models are needed to explore these differences.
Footnotes
We thank Dr. David Kearns and Dr. Yogita Chudasama for their helpful comments and Dr. Brian Baldo for his consultation regarding our use of the opioid drug DAMGO. This work was supported by a Pilot Award from the DC Center for AIDS Research and an NIH Grant to M.L. (1R15DA046375-01A1).
The authors declare no competing financial interests.
Correspondence should be addressed to Mark Laubach at mark.laubach@american.edu.