Abstract
Primates, including humans, use stimulus–reward associations to guide foraging. We previously showed that both the rhinal cortex (Rh) and rostromedial caudate (rmCD) of rhesus monkeys play causal roles in assigning value to visual stimuli. Layer 5 neurons in Rh project to rmCD. Here, we reversibly interrupted this Layer 5 connection in two male monkeys by combining a unilateral Rh lesion with contralateral expression of an inhibitory DREADD delivered using a retrograde lentivirus (fusion glycoprotein type E) injected into rmCD. Interruption of projection neurons from Rh to rmCD had little effect on already learned stimulus–reward associations but impaired the learning of new associations. The learning impairment was also present when the projection neurons from the perirhinal cortex (PRh) to rmCD were silenced using microinjections of deschloroclozapine into PRh. These results suggest that the learning, but not retrieval, of visual stimulus–reward associations involves projection neurons from PRh to rmCD.
Significance Statement
Primates use stimulus–reward associations to guide foraging. Both the rhinal cortex (Rh) and rostromedial caudate (rmCD) play a causal role in visual stimulus–reward associations in nonhuman primates. There is a strong anatomical projection from Rh to rmCD. Reversible interruption of this connection, specifically of perirhinal cortex neurons projecting to rmCD, had little effect on already learned stimulus–reward associations but impaired the learning of new associations. Our findings emphasize the significance of this circuit in adaptive behavior and indicate that learning and retrieval depend on different neural pathways.
Introduction
Primates, including humans, quickly learn to use visual cues to predict and maximize the acquisition of future rewards. This ability presumably relies on a neural network that associates visual stimuli with reward values. Rhinal cortex (Rh) is thought to be a critical node in the network of brain regions involved in stimulus–reward associations; anatomically, Rh receives highly processed sensory information from multiple modalities, including the strongest unimodal afferent from the inferior temporal cortex (Suzuki and Naya, 2014; Miyashita, 2019), which lies at the end of the ventral visual pathway. Rh is also densely connected with the reward processing network, i.e., the orbitofrontal cortex (OFC; Suzuki and Amaral, 1994; Lavenex et al., 2002), nucleus accumbens (Friedman et al., 2002), caudate (Yeterian and Van Hoesen, 1978; Van Hoesen et al., 1981; Saint-Cyr et al., 1990; Burwell et al., 1995; Choi et al., 2017; Griggs et al., 2017), amygdala (Stefanacci et al., 1996), and midbrain dopamine neurons (Berger et al., 1988; Richfield et al., 1989; Akil and Lewis, 1993). Thus, Rh has connectivity that can support the association of visual stimuli and reward. Ablation of Rh impairs the ability to use learned stimulus–reward associations to guide behavior (Gaffan and Murray, 1992; Buckley and Gaffan, 1997; Thornton et al., 1997; Clark et al., 2012), to learn stimulus–reward association reversals (Murray et al., 1998), and to learn stimulus–reward–schedule associations in nonhuman primates (Liu et al., 2000). Inactivation of D2 receptors in Rh induces deficits in learning to associate visual cues with progress through a reward schedule (Liu et al., 2004). The responses of many neurons in Rh to visual stimuli are modulated by reward (Mogami and Tanaka, 2006) or related to reward schedules (Liu and Richmond, 2000; Sugase-Miyamoto and Richmond, 2007).
The rostromedial caudate (rmCD) receives afferents from multiple key regions related to reward processing (Haber et al., 2006; Choi et al., 2017), i.e., OFC, dorsal anterior cingulate cortex, Rh, and amygdala. Chemogenetic inactivation of bilateral rmCD or functional disconnection of OFC and rmCD induces impairments in adjusting motivation based on learned visual stimuli (Nagai et al., 2016; Oyama et al., 2022). Direct silencing of the projections from OFC to rmCD impairs the experience-based updating of values associated with visual stimuli, but not the acquisition of new stimulus–reward associations, or the application of learned stimulus–reward associations (Oyama et al., 2024). Neural activity in rmCD reflects information regarding expected reward (Fujimoto et al., 2019).
To determine whether the Rh projections to rmCD are important for stimulus–reward associations, we combined a unilateral Rh lesion with contralateral expression of the inhibitory hM4Di DREADD in neurons projecting to rmCD (Fig. 1A). A similar method proved effective in our previous study (Eldridge et al., 2016). The DREADD was introduced using a retrograde lentivirus (fusion glycoprotein type E, FuG-E) injected into rmCD. Projection neurons from Rh to rmCD in the intact hemisphere were then silenced by activating the inhibitory DREADD via systemic or local delivery of an agonist [clozapine N-oxide (CNO; Armbruster et al., 2007) or deschloroclozapine (DCZ; Nagai et al., 2020)]. We found that projection neurons from the perirhinal cortex (PRh) to rmCD appear to play a central role in learning, but not retrieval, of visual stimulus–reward associations.
Figure 1. Experimental design and task. A, Experimental design. Injections of retrograde virus expressing hM4Di (yellow) in left rmCD were combined with a contralateral Rh lesion (black) to interrupt the projection neurons from Rh to rmCD when hM4Di receptors were activated by CNO or DCZ. B, Left top, Retrograde virus injections in rmCD indicated by contrast reagent Mn2+ in MRI images. Left bottom, GFP immunostaining around the injection sites in rmCD. Right, GFP immunostaining in Rh. GFP-positive cells indicate neurons projecting to rmCD. The white arrow indicates the rhinal sulcus. C, Top, Virus injection sites in left rmCD indicated by contrast reagent Mn2+ in MRI images (left) and regions of strong [11C]DCZ binding in left rmCD shown in coregistered MR and PET images (right). Middle, Regions of strong [11C]DCZ binding in the left caudal Rh cortex shown in coregistered MR and PET images. Bottom, [11C]DCZ binding in left rmCD and spared Rh was significantly stronger than that in the right hemisphere for both monkeys. Monkey C, n = 9 and 4 sections; Monkey P, n = 9 and 3 sections, for rmCD and spared Rh, respectively. Data are represented as mean ± SEM. D, Lesion reconstruction of Rh removal in the right hemisphere. E, Events in a trial of the reward-size task with (purple) or without (cyan) correction trials. The differences between the two tasks are highlighted with color. *p < 0.05; **p < 0.01.
Materials and Methods
Subjects
Three adult male monkeys (Macaca mulatta) were used in the present study. Two of them, Monkey C (10.1 kg, 7 years old) and Monkey P (9.3 kg, 6 years old), participated in behavioral tests. Monkey A was used to confirm the projections from Rh to rmCD via postmortem immunohistochemistry. All experimental protocols conformed to the Institute of Medicine Guide for the Care and Use of Laboratory Animals and were performed under an Animal Study Proposal approved by the Animal Care and Use Committee of the National Institute of Mental Health.
Viral vector production
The retrograde lentivirus was produced by packaging the Lenti-hSyn::hM4Di-CFP/GFP plasmid (Eldridge et al., 2016) with the FuG-E (Kato et al., 2014) envelope in 293T kidney cells. Concentrated lentivirus particles were suspended in PBS at a titer of 1.5 × 10^9 iu/ml and stored at −80°C in 10 μl aliquots.
Surgeries
Surgeries were performed under aseptic conditions in a fully equipped operating suite with veterinary supervision. Animals were sedated with ketamine (10 mg/kg, i.m.) before surgery. During surgery anesthesia was maintained with isoflurane (1–4%, to effect). Body temperature, heart rate, blood pressure, SpO2, and expired CO2 were monitored throughout all the procedures. From the evening prior to the surgery to 5–7 d after surgery, antibiotics were administered to monkeys.
Aspiration lesions of Rh have been described in detail previously (Meunier et al., 1993; Clark et al., 2012). In the current study, Rh was removed from the right hemisphere of both monkeys (Fig. 1D). For the virus injection to rmCD, magnetic resonance imaging was performed before the surgery to estimate the stereotaxic coordinates of injection sites. To access rmCD, the skin, fascia, and temporalis muscle were reflected, a bone flap was removed, and the dura was opened to expose the cortex. Virus mixed with the MRI contrast reagent manganese chloride (MnCl2•4H2O; Mn2+, 0.1 mM) was loaded into a glass syringe (100 μl, Hamilton), which was mounted on a Nanomite pump (Harvard Apparatus). The needle of the syringe was sheathed with a silica capillary (450 μm OD) to create a ∼1 mm step from the needle tip to minimize backflow. The needle was lowered into rmCD in the left hemisphere through an incision in the dura mater. Seven virus injections (60 μl in total) were made into two tracks 2 mm apart along the rostral–caudal axis. The needle remained in place for 10 min after the last injection in each track to reduce backflow. After surgery, MRI was performed to localize the virus injection locations. For Monkey P, the initial injections were medial and ventral to the intended target. This led us to give a second set of injections, which covered the intended rmCD target. Behavioral tests were started at least 6 weeks after injection to allow time for maximal expression (Nagai et al., 2016).
The surgeries for chamber installation were performed 13 and 23 months after the preceding surgery for Rh aspiration or virus injection, respectively, for Monkeys P and C. Bone flaps temporarily removed in preceding surgeries had reintegrated with the skull by the time of chamber installation for both monkeys.
Behavioral tasks
Monkeys sat in a primate chair facing a computer monitor in a darkened, sound-attenuated chamber. The behavioral task was controlled by the real-time experimental system “REX” (Hays et al., 1982). Visual stimuli were presented by commercially available software (Presentation, Neurobehavioral Systems).
Monkeys were initially trained to press and release a touch-sensitive bar to get fluid reward. After this initial shaping, a red/green discrimination task was introduced. To initiate a trial, monkeys had to touch a touch-sensitive bar. After 100 ms, a small red square (0.5 × 0.5°) was presented in the center of the screen. To obtain the liquid reward, monkeys needed to continue to press the bar until the red square turned to green, which happened randomly within 500–1,500 ms after the red square appeared. The response window was 200–1,000 ms after the green square appeared. Releasing the bar within the response window was followed by visual feedback (a blue square). Reward delivery occurred 200–400 ms after the visual feedback appeared. There was a 1 s intertrial interval regardless of the outcome in the trial.
After Monkey P reached criterion in the red/green task (correct rate ≥85% on 2 consecutive days), a visually cued reward-size task with correction trials was introduced (Fig. 1E). Each trial began after monkeys pressed the bar. A visual cue (10 × 10°) chosen from one of four possibilities appeared in the center of the monitor 100 ms after the bar touch. After 500–750 ms, a red square (0.5 × 0.5°) appeared in the center of the visual cue. Monkeys were required to hold the bar until the red square turned green (500–1,500 ms). To obtain the reward, monkeys needed to release the bar 200–1,000 ms after the green square appeared. Releases outside this time window were counted as errors. An error was followed by repetition of the current trial until the trial was completed correctly. Correct bar releases were followed by a blue square and, after 200–400 ms, by one, two, four, or eight drops of water reward (200 ms interdrop interval). The intertrial interval was 1 s. Monkeys performed this task for 90 min each session.
Monkey P was then switched to the reward-size task without correction trials. Monkey C was started on the reward-size task without correction trials directly after red/green discrimination training. For the reward-size task without correction trials, the duration of the red square was lengthened to 2,000–3,000 ms (Fig. 1E). In this task, after monkeys released the bar, the next trial presented a new randomly chosen cue whether or not the offer had been accepted. Other parameters were the same as in the reward-size task with correction trials (Fig. 1E). Monkeys were presented with four or eight familiar cues or four new cues for 70 min in each session.
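The trial outcomes in the two task versions follow directly from when the bar is released relative to the green square. The Python sketch below is only an illustration of that rule: the function names and outcome labels are ours rather than part of the REX task code, and treating the 200–1,000 ms response window as inclusive at both ends is an assumption.

```python
RESPONSE_WINDOW_MS = (200, 1000)  # after green onset, from the task description


def classify_release(release_ms_from_green, correction_trials):
    """Label one trial from the bar-release time relative to green onset (ms).

    Negative times mean the bar was released before the square turned green.
    Window boundaries are treated as inclusive (an assumption).
    """
    start, end = RESPONSE_WINDOW_MS
    if start <= release_ms_from_green <= end:
        return "correct" if correction_trials else "accepted"
    if release_ms_from_green < start:
        # Early release: an error with correction trials, a refusal without them.
        return "early_error" if correction_trials else "refused"
    return "late_error"


def next_cue_is_repeated(outcome, correction_trials):
    """With correction trials an error repeats the cue; otherwise a new random cue follows."""
    return correction_trials and outcome in ("early_error", "late_error")


if __name__ == "__main__":
    for t in (-800, 350, 1500):
        print(t, classify_release(t, correction_trials=False))
```

In the version without correction trials an early release therefore ends the offer cheaply, which is what turns the task into a temporal choice between accepting and refusing.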
For Monkey P, both familiar (>2 weeks) and new cues were tested (reward-size task without correction trials) before and after the first surgery (virus injections in rmCD; Extended Data Fig. 1-1). For Monkey C, only familiar cues (reward-size task without correction trials) were tested before, not after, the first surgery (Extended Data Fig. 1-1, rhinal lesion). Monkeys performed tasks for 5–6 d per week. Two systemic or local injections (one for DCZ and one for saline; order was counterbalanced across weeks) were performed each week.
Drugs and injections
CNO (RTI International) was dissolved at 66 mg/ml in 100% DMSO and then diluted with PBS to produce a final concentration of 10 mg/ml CNO in 15% DMSO (v/v). CNO (10 mg/kg) was injected intramuscularly for some of the behavioral tests for Monkey C (Extended Data Fig. 1-1). DCZ was dissolved in saline with HCl to a concentration of 1 mg/ml and then neutralized with NaOH. For systemic injections, low-dose DCZ (0.1 mg/kg) was injected intramuscularly. A matched volume of saline was used as the vehicle control. Behavioral performance was tested 30–60 min after intramuscular injections.
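The concentrations above can be checked with simple arithmetic. The following Python sketch is illustrative only: the helper names are ours, and the body weights are the values listed under Subjects, which may differ from the weights on any given test day.

```python
def diluted_solvent_fraction(stock_conc, final_conc, stock_solvent_fraction=1.0):
    """Solvent fraction (v/v) left after diluting a stock down to final_conc."""
    return stock_solvent_fraction * final_conc / stock_conc


def injection_volume_ml(body_weight_kg, dose_mg_per_kg, conc_mg_per_ml):
    """Volume needed to deliver dose_mg_per_kg at the given concentration."""
    return body_weight_kg * dose_mg_per_kg / conc_mg_per_ml


# 66 mg/ml CNO in 100% DMSO diluted to 10 mg/ml leaves 10/66 of the volume as DMSO.
dmso = diluted_solvent_fraction(stock_conc=66.0, final_conc=10.0)
print(f"DMSO after dilution: {dmso:.1%}")  # about 15%

# Body weights from the Subjects section (assumed here to apply on test days).
cno_ml = injection_volume_ml(10.1, dose_mg_per_kg=10.0, conc_mg_per_ml=10.0)
dcz_ml = injection_volume_ml(9.3, dose_mg_per_kg=0.1, conc_mg_per_ml=1.0)
print(f"CNO volume: {cno_ml:.2f} ml, DCZ volume: {dcz_ml:.2f} ml")
```

The 10/66 ratio is where the stated 15% DMSO (v/v) comes from.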
For local microinjections in PRh or entorhinal cortex (ERh), DCZ solution was further diluted to 100 nM with saline. To localize PRh and ERh, a grid filled with MR contrast agent (gadolinium, fixed with gelatin) was placed in the chamber during MR imaging. The coordinates of four injection sites (2 mm apart) in PRh and ERh were calculated based on this chamber-grid system and verified with MRI scans after manganese (Mn2+, 0.5 mM, 3 μl/site) injections (Fig. 3C). For Monkey C’s injections, 24 gauge stainless steel guide tubes were inserted through the grid holes to penetrate the dura and stopped ∼5 mm above the target regions, which allowed 31 gauge stainless steel infusion cannulas to reach the target brain regions accurately. Cannulas, guide tubes, and the grid were removed from the chamber after injections. For Monkey P, a semichronic procedure was used (Tang et al., 2024). Polyimide guide tubes (glued to the grid) and stainless steel stylets were left in the brain across injection days. The chamber was filled with silicone sealant. DCZ or saline was loaded into four 100 μl glass syringes (Hamilton), which were connected to the infusion cannulas, and then pumped (Nanomite pump, Harvard Apparatus) into PRh or ERh at a rate of 0.18 μl/min for 17 min (∼3 μl). Infusion cannulas were left in place for 15 min after injections. Monkeys were tested immediately after the infusions. DCZ solution was filtered with a 0.2 μm filter before any systemic or local injections. For clinical reasons, we were not able to perform local infusions in ERh in Monkey C.
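As a quick check of the infusion parameters, the volume and the amount of DCZ delivered per site follow from the pump rate, duration, and concentration. This Python sketch uses only the values stated above; the variable names are ours.

```python
RATE_UL_PER_MIN = 0.18   # pump rate
DURATION_MIN = 17        # infusion duration
N_SITES = 4              # injection sites per region
DCZ_NM = 100.0           # DCZ concentration for local infusions

volume_per_site_ul = RATE_UL_PER_MIN * DURATION_MIN   # about 3 ul
total_volume_ul = volume_per_site_ul * N_SITES

# nM x ul = 1e-9 mol/L x 1e-6 L = 1e-15 mol, i.e. femtomoles.
dcz_per_site_fmol = DCZ_NM * volume_per_site_ul

print(f"{volume_per_site_ul:.2f} ul per site, {total_volume_ul:.2f} ul in total")
print(f"{dcz_per_site_fmol:.0f} fmol DCZ per site")
```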
Positron emission tomography (PET) imaging
To visualize the expression of hM4Di, we conducted PET imaging after virus injections (after ∼9 months for Monkey C and ∼31 months for Monkey P). Monkeys were anesthetized with ketamine (10–15 mg/kg, i.m.) and dexmedetomidine (0.005–0.05 mg/kg, i.m.) before the PET scan. Anesthesia was maintained with isoflurane during the PET imaging procedure. PET scans were conducted with a microPET Focus 220 scanner (Siemens Medical Solutions) for Monkey C or a Mediso LFER 150 PET/CT (Mediso Medical Imaging Systems) for Monkey P. Dynamic PET scans were acquired for 90 min after an intravenous bolus injection of [11C]DCZ (333 MBq). [11C]DCZ was synthesized as described previously (Yan et al., 2021).
Behavioral analysis
To evaluate the learning process, we calculated the refusal rate in nonoverlapping five-trial bins for each stimulus–reward association. Data were truncated for each stimulus–reward association based on the minimal number of trials across the four associations and across all the saline and DCZ treatment days. For the stimulus–reward association task without correction trials, the time difference between bar release and cue onset in refused trials was defined as the reaction time for the two smaller rewards. The time difference between bar release and green square onset in accepted trials was defined as the reaction time for the two larger rewards. To evaluate the dynamics of reaction times for the two smaller or larger reward trials during learning, we averaged the reaction times over nonoverlapping sets of five refused or accepted trials, respectively. For the stimulus–reward association task with correction trials, reaction times for all the reward sizes were defined as the time difference between bar release and green square onset in correct trials.
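The binning and truncation steps can be sketched as follows. This is a minimal Python illustration, not the analysis code used in the study: the function names and the example sequence are made up, and dropping an incomplete final bin is an assumption.

```python
def common_trial_count(sessions):
    """Smallest per-cue trial count across all sessions.

    sessions: list of dicts mapping a cue label to its per-trial refusal
    sequence (1 = refused, 0 = not refused) in presentation order.
    """
    return min(len(trials) for session in sessions for trials in session.values())


def binned_refusal_rate(refused, bin_size=5, n_trials=None):
    """Refusal rate in nonoverlapping bins for one cue.

    n_trials truncates the sequence first; an incomplete trailing bin is
    dropped (an assumption, the source does not say how it was handled).
    """
    if n_trials is not None:
        refused = refused[:n_trials]
    n_bins = len(refused) // bin_size
    return [
        sum(refused[i * bin_size:(i + 1) * bin_size]) / bin_size
        for i in range(n_bins)
    ]


# Made-up example: refusals of a small-reward cue become more frequent.
one_drop = [0, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1]
print(binned_refusal_rate(one_drop))  # [0.4, 0.8]
```

The same binning applied to reaction times instead of refusals would give the reaction-time dynamics described above.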
ANOVA models were used to analyze reaction times, error rates, and refusal rates. P values were adjusted by Bonferroni’s correction for multiple comparisons. The unpaired t test was used to compare total reward or total initiated trials between saline and DCZ treatment days.
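Bonferroni's correction multiplies each p value by the number of comparisons in the family and caps the result at 1, which is why some adjusted values reported in the Results are exactly 1. A minimal Python sketch follows; the input p values are made up, and the size of each comparison family in the study is not stated here.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p values: p * m, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Made-up raw p values for a family of four comparisons.
raw = [0.0001, 0.3, 0.5, 0.02]
print(bonferroni(raw))  # [0.0004, 1.0, 1.0, 0.08]
```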
Simulation of strategies
To evaluate different strategies in the reward-size task without correction trials, we simulated several strategies with different refusal rates for the four reward sizes (Extended Data Fig. 2-1A). Reaction times for refusing the two smaller rewards and accepting the two larger rewards were taken from the two monkeys’ average performance on the systemic saline treatment days. For reaction times when the two smaller rewards were accepted or the two larger rewards were refused, because monkeys rarely made such decisions, we instead used the average reaction times when monkeys accepted the two larger rewards or refused the two smaller rewards, respectively. Other events in the simulation were the same as in the design in Figure 1E. Simulations were run for 30 min, and cumulative reward was calculated every minute.
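A simulation of this kind can be sketched in a few lines of Python. The sketch below is not the authors' code. Event timings come from the task description, but the two reaction times are hypothetical placeholders (the study used the monkeys' measured averages), the time needed to re-press the bar is ignored, and reward is credited when the trial ends.

```python
import random

DROPS = (1, 2, 4, 8)
# Event timings from the task description (ms); ranges are sampled uniformly.
CUE_DELAY = 100
CUE_TO_RED = (500, 750)
RED_DURATION = (2000, 3000)
FEEDBACK_TO_REWARD = (200, 400)
INTERDROP = 200
ITI = 1000
# HYPOTHETICAL reaction times (ms), placeholders for the measured averages.
RT_REFUSE_FROM_CUE = 400
RT_ACCEPT_FROM_GREEN = 350
REFUSE_MS = CUE_DELAY + RT_REFUSE_FROM_CUE + ITI


def accept_ms(drops, cue_to_red, red, feedback):
    """Duration of an accepted trial given the three variable intervals."""
    return (CUE_DELAY + cue_to_red + red + RT_ACCEPT_FROM_GREEN
            + feedback + INTERDROP * (drops - 1) + ITI)


def simulate(refusal_prob, minutes=30, seed=0):
    """Cumulative drops at the end of each minute for one strategy.

    refusal_prob maps each reward size to the probability of refusing it.
    """
    rng = random.Random(seed)
    cumulative, t, total = [], 0.0, 0
    while len(cumulative) < minutes:
        drops = rng.choice(DROPS)
        if rng.random() < refusal_prob[drops]:
            duration, earned = REFUSE_MS, 0
        else:
            duration = accept_ms(drops, rng.uniform(*CUE_TO_RED),
                                 rng.uniform(*RED_DURATION),
                                 rng.uniform(*FEEDBACK_TO_REWARD))
            earned = drops
        t += duration
        # Close every minute that ended before this trial finished.
        while len(cumulative) < minutes and t > (len(cumulative) + 1) * 60_000:
            cumulative.append(total)
        total += earned
    return cumulative


def expected_rate(refusal_prob):
    """Long-run drops per second, using the mean of each uniform interval."""
    time_ms = reward = 0.0
    for d in DROPS:
        p = refusal_prob[d]
        mean_accept = accept_ms(d, sum(CUE_TO_RED) / 2, sum(RED_DURATION) / 2,
                                sum(FEEDBACK_TO_REWARD) / 2)
        time_ms += p * REFUSE_MS + (1 - p) * mean_accept
        reward += (1 - p) * d
    return 1000 * reward / time_ms


STRATEGIES = {
    "accept all": {1: 0.0, 2: 0.0, 4: 0.0, 8: 0.0},
    "refuse 1": {1: 1.0, 2: 0.0, 4: 0.0, 8: 0.0},
    "refuse 1 and 2": {1: 1.0, 2: 1.0, 4: 0.0, 8: 0.0},
    "accept only 8": {1: 1.0, 2: 1.0, 4: 1.0, 8: 0.0},
}

if __name__ == "__main__":
    for name, strategy in STRATEGIES.items():
        print(name, simulate(strategy)[-1], round(expected_rate(strategy), 3))
```

With these placeholder reaction times, the expected reward rate is highest for refusing the two smaller offers and accepting the two larger ones; with different reaction times the ranking could change, so the placeholders should be replaced by measured values before drawing conclusions.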
PET imaging analysis
PET imaging data were reconstructed using Fourier rebinning plus two-dimensional filtered back projection (Focus 220) and using 3D-OSEM (LFER 150), both with attenuation and scatter correction. To estimate the specific binding of [11C]DCZ, regional binding potential relative to nondisplaceable uptake (BPND) was calculated using the PMOD software (PMOD 4.4) and a multilinear reference tissue model, with the cerebellum as a reference. PET images and MRI images were coregistered and overlaid using PMOD for evaluation. rmCD (nine sections rostral to the anterior commissure, from +19 to +23 in steps of 0.5, for both monkeys) and the spared most caudal Rh (four sections from +9.5 to +11 for Monkey C, three sections from +9.5 to +10.5 for Monkey P) in both hemispheres were outlined on MRI images. BPND signals in coregistered PET images were then subtracted and averaged within the outlines. BPND differences between the left and right hemisphere in each image were averaged and shown in Figure 1C, bottom.
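The section counts and the left-versus-right comparison can be illustrated with a short Python sketch. The helper names and the BPND values below are made up for illustration, and only the paired t statistic is computed (a p value would additionally need the t distribution with n − 1 degrees of freedom).

```python
from math import sqrt
from statistics import mean, stdev


def section_positions(first, last, step=0.5):
    """Anteroposterior positions of the outlined sections, inclusive of both ends."""
    n = int(round((last - first) / step)) + 1
    return [first + i * step for i in range(n)]


def paired_summary(left, right):
    """Mean, SEM, and paired t statistic of left-minus-right BPND differences."""
    diffs = [l - r for l, r in zip(left, right)]
    n = len(diffs)
    m = mean(diffs)
    sem = stdev(diffs) / sqrt(n)
    return {"n": n, "mean_diff": m, "sem": sem, "t": m / sem, "df": n - 1}


print(len(section_positions(19, 23)))     # 9 rmCD sections
print(len(section_positions(9.5, 11)))    # 4 Rh sections (Monkey C)
print(len(section_positions(9.5, 10.5)))  # 3 Rh sections (Monkey P)

# Made-up BPND values for three sections, for illustration only.
summary = paired_summary(left=[0.5, 0.6, 0.7], right=[0.4, 0.4, 0.4])
print(summary)
```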
Histology
The monkeys were deeply anesthetized with Beuthanasia solution and perfused with 1 L of normal saline solution, followed by 3 L of 4% paraformaldehyde in 0.1 M PBS. The brains were removed and cryoprotected through a series of glycerols in 0.1 M PBS. The brains were blocked in the coronal plane and then quickly frozen in −80°C isopentane. The brains were sectioned into 40 μm slices in the coronal plane, and the sections were collected in 10 series. For hM4Di-GFP immunofluorescence, sections were blocked in 5% normal goat serum (v/v) and 0.3% Triton X-100 (v/v) in Tris-buffered saline, incubated with rabbit anti-GFP primary antibody (1:10,000, Abcam AB290) in blocking buffer, washed, and incubated with a biotinylated goat anti-rabbit IgG antibody (1:500, Vector Laboratories BA-1000). For NeuN immunofluorescence, sections were blocked as above, incubated with mouse anti-NeuN primary antibody (1:1,000, Chemicon MAB377), washed, and detected with Alexa Fluor 647 goat anti-mouse IgG (1:500, Life Technologies A-21422). Brain sections were imaged using a slide scanning microscope (Olympus VS200).
Results
Chemogenetic interruption of Rh to rmCD projections
Projections from Rh to rmCD have been revealed using anatomical tract-tracing (Choi et al., 2017). To confirm that our retrograde virus also reveals this projection, a nonreplicating pseudotyped retrograde lentivirus (FuG-E hSyn::hM4Di-GFP) was unilaterally injected into the rmCD of one monkey (Monkey A). MnCl2•4H2O (Mn2+, 0.1 mM) was mixed with the suspension of viral vector just before injection to allow visualization of the injection sites after surgery using MR imaging (Fig. 1B, left top). GFP-positive neurons were observed almost exclusively in Layer 5 of Rh (Fig. 1B, right).
To investigate the role of this projection in stimulus–reward associations, two additional rhesus monkeys, Monkey C and Monkey P, received unilateral injections into the left rmCD (Fig. 1C, top) of nonreplicating pseudotyped retrograde lentiviruses carrying the gene for the inhibitory DREADD, hM4Di (FuG-E hSyn::hM4Di-CFP; Kato et al., 2014). In a separate surgery, the monkeys received aspiration lesions of Rh of the right hemisphere. The order of the two surgeries was counterbalanced across two monkeys.
In vivo expression of hM4Di was visualized via PET imaging with DREADD-selective radioligand 11C-labeled DCZ (Nagai et al., 2016; Yan et al., 2021). In both monkeys, increased signal was observed in left rmCD and Rh (Fig. 1C) when compared with a corresponding region in the contralateral hemisphere (the most caudal Rh in the right hemisphere was spared from the aspiration removal for both monkeys; rmCD, p = 2.52 × 10−9; 6.02 × 10−5, for Monkeys C and P, respectively; Rh, p = 0.03; 1.61 × 10−5 for Monkeys C and P, respectively; paired t test), which indicated successful expression of hM4Di in Rh neurons projecting to rmCD. The boundaries of the aspiration lesions were reconstructed after surgery from MRI images. Rh removals were as intended (Fig. 1D; monkey C, Rh, 66.8%; PRh, 67%; ERh, 66%; monkey P, Rh, 74.5%; PRh, 85.9%; ERh, 62.2%). Both monkeys showed partial damage to the amygdala (4.2 and 29.5% for Monkeys C and P, respectively).
We first injected CNO or DCZ systemically, presumably suppressing activity in all DREADD-expressing neurons projecting to rmCD in the intact hemisphere, including the Rh neurons projecting to rmCD, making this the only pathway bilaterally compromised.
Chemogenetic interruption of Rh neurons projecting to rmCD had little effect on learned stimulus–reward associations
To initiate a trial, monkeys were required to touch a bar, causing one of four pictures to appear. A small red square appeared over the picture 500–750 ms later. After a random interval, 500–1,500 ms, the red square turned green. If the monkey released the bar 200–1,000 ms after the green square appeared, a liquid reward was delivered. The picture that appeared indicated the size of the reward to be delivered (Fig. 1E). Both monkeys practiced this task in at least 12 sessions spread over 3 weeks. Trials in which monkeys released the bar before (early error) or after (late error) the response window were deemed as error trials. In this initial version of the task, errors were followed by correction trials, i.e., the same stimulus was shown repeatedly until the response was correct.
As seen previously, error rates were higher for smaller offers (Minamimoto et al., 2009; Eldridge et al., 2016), especially in the late phase of the session presumably when the monkeys were becoming satiated (Fig. 2A). Systemic delivery of CNO (10 mg/kg, i.m., monkey C) or DCZ (0.1 mg/kg, i.m., monkey P) did not change monkeys' error rates across the four reward sizes (Fig. 2B; p(treatment) = 0.48, 0.44; p(treatment × reward) = 0.46, 0.49, for Monkeys C and P, respectively; two-way ANOVA). For both monkeys, interruption of Rh to rmCD projections by systemic DCZ/CNO treatments had no effect on satiation-dependent changes in error rates (p > 0.05 for all comparisons, two-way ANOVA). For both monkeys, reaction times were shorter for larger offers. Systemic DCZ/CNO administration did not affect the reaction times of monkey C (p(treatment) = 0.95; p(treatment × reward) = 0.79; two-way ANOVA), but significantly shortened those of Monkey P (p(treatment) = 7.46 × 10−4; p(treatment × reward) = 0.58, two-way ANOVA; Table 1). Total reward earned and total trials initiated were not affected by systemic DCZ/CNO treatment (p > 0.05 for all comparisons; unpaired t test).
Figure 2. Chemogenetic interruption of Rh to rmCD projections did not impair monkeys’ performance with learned stimulus–reward associations. A, All trials from an example session with systemic saline treatment in the reward-size task with correction trials, using familiar stimulus–reward associations. Black, red, and cyan dots indicate correct trials, early error trials (bar releases before the response window), and late error trials (bar releases after the response window), respectively. B, Error rates were not affected by systemic CNO (Monkey C) or DCZ (Monkey P) treatments. C, Same as A but for the reward-size task without correction trials. Black, red, and cyan dots indicate accepted trials, refused trials (bar releases before the response window), and late error trials (bar releases after the response window), respectively. D, Refusal rates were not affected by systemic DCZ treatment. Blue and red lines indicate systemic saline and DCZ treatment, respectively. All data are represented as mean ± SEM.
Table 1. Reaction times (ms) for familiar cues in tasks with and without correction trials
In the reward-size task with correction trials described above, monkeys could not make an explicit choice based on stimulus–reward associations. We therefore removed the correction trials from the reward-size task, converting it into a temporal choice task: release before green to refuse the offered reward; release on green to accept. In each trial, a randomly chosen cue was presented regardless of the outcome of the previous trial. As expected, the monkeys refused cues associated with small rewards and accepted cues associated with large rewards (Fig. 2C), as seen previously in a similar task (Falcone et al., 2019; Wang et al., 2023).
To test whether the monkeys adopted a strategy that maximized the rate of fluid accumulation, we simulated the outcomes of several possible strategies (Extended Data Fig. 2-1A). The strategy monkeys adopted was to accept two larger rewards and refuse two smaller rewards. Simulations show that this strategy maximized the reward rate for this set of rewards (Extended Data Fig. 2-1B).
With familiar cues, which were presented to monkeys for at least 2 weeks, performance accuracy on the “explicit choice” task was not significantly affected by systemic DCZ treatment (Fig. 2D; p(treatment) = 0.35, 0.991; p(treatment × reward) = 0.138, 0.322; for Monkeys C and P, respectively; two-way ANOVA). Monkeys accepted a limited number of the two smaller-reward trials at the beginning of the session, which might indicate foraging or relearning the familiar cues. Systemic DCZ treatment did not impair this “start-up” behavior (Extended Data Fig. 2-1C; p(treatment) > 0.05; p(treatment × trials) > 0.05 for all comparisons; two-way ANOVA). For Monkey P, we increased the memory load by presenting the monkey with two sets of familiar visual cues (two visual stimuli associated with each reward size). Again, there was no significant effect of systemic DCZ treatment (p(treatment) = 0.85; p(treatment × reward) = 0.99; two-way ANOVA).
Neither the total reward earned nor the total trials initiated were affected by DCZ treatment (p > 0.05 for all comparisons; unpaired t test). DCZ treatment significantly increased the reaction times when Monkey C refused the two smaller rewards (time difference between bar release and cue onset; p(treatment) = 0.003; p(treatment × reward) = 0.997; two-way ANOVA) and reduced the reaction times when Monkey P accepted the two larger rewards (time difference between bar release and green square onset; p(treatment) = 0.015; p(treatment × reward) = 0.962; two-way ANOVA; Table 1). These reaction time effects were not nonspecific effects of DCZ or CNO, because DCZ or CNO did not have significant effects on the corresponding reaction times before any surgeries (p(treatment) = 0.143, 0.12; p(treatment × reward) = 0.818, 0.83 for Monkeys C and P, respectively; two-way ANOVA; Table 1). However, after expression of retrograde virus in rmCD of Monkey P (and before the rhinal lesion), systemic DCZ treatment significantly reduced reaction times when Monkey P accepted the two larger rewards (p(treatment) = 0.005; p(treatment × reward) = 0.727; two-way ANOVA; Table 1). For Monkey C, the order of surgeries precluded equivalent control tests. This confound made a clear explanation of the reaction time changes caused by systemic DCZ injections difficult, leading us to focus on the reaction time changes introduced by local DCZ injections in the following section.
The results from both versions of the stimulus–reward association task demonstrate that interruption of Rh to rmCD projections has little effect on monkeys' retrieval of learned stimulus–reward associations.
Chemogenetic interruption of Rh neurons projecting to rmCD impaired learning of new stimulus–reward associations
To test whether Rh projections to rmCD might be needed for the learning of new stimulus–reward associations, we presented a new set of four stimuli each day in the temporal choice task. After sampling a few times, both monkeys quickly began to refuse trials in which a cue associated with either of the two smaller rewards appeared (Fig. 3A). As the monkeys became familiar with the smaller-reward associations, their reaction times decreased when the monkeys refused the smaller rewards (Monkey C, p = 0.004; 1.6 × 10−5; monkey P, p = 0.011, 0.104, for one- and two-drop reward in days with saline treatment, respectively; one-way ANOVA). Monkeys accepted the trials in which the two larger rewards were offered from the beginning of the session, and their reaction times to accept these trials were stable during learning (p > 0.05 for four- and eight-drop reward in both monkeys; one-way ANOVA). Thus, the effect of learning was only seen in the performance for the two smaller rewards, not the two larger rewards.
Figure 3. Chemogenetic interruption of Rh to rmCD projections impaired monkeys’ performance with new stimulus–reward associations. A, All trials from an example session with systemic saline treatment in the reward-size task without correction trials, using new stimulus–reward associations. Black, red, and cyan dots indicate accepted trials, refused trials, and late error trials, respectively. B, Learning process for associations with the two smaller rewards (1 and 2 drops) on systemic DCZ (red) and saline (blue) treatment days, evaluated by refusal rates. C, Local infusion sites in PRh verified with Mn2+ (0.5 mM, 3 μl) in T1 MRI images. Green lines outline PRh. D, Same as B but for local infusions. All data are represented as mean ± SEM. *p < 0.05; **p < 0.01; p values were adjusted by Bonferroni’s correction.
Systemic DCZ treatment significantly impaired the learning process for the two-drop reward for both monkeys (Fig. 3B; p(treatment) = 3.81 × 10−4, 5.6 × 10−5; p(treatment × trials) = 1, 0.028, for Monkeys C and P, respectively) but did not affect monkeys’ performance on the two larger reward offers (p > 0.05 for all comparisons; two-way ANOVA). This deficit cannot be attributed to a shift of strategy or valuation because no change was observed with familiar cues (Fig. 2C,D). Nor was this deficit caused by nonspecific effects of DCZ or by unilateral interruption of interneurons in rmCD and neurons projecting to rmCD, because systemic DCZ treatment produced no significant difference before or after retrograde virus injections in rmCD for Monkey P (p > 0.05 for all comparisons; two-way ANOVA).
Above we showed that interruption of Rh to rmCD projections did not impair relearning of overtrained stimulus–reward associations (>2 weeks; Extended Data Fig. 2-1C). However, when we interrupted Rh to rmCD projections in Monkey P with systemic DCZ treatment shortly after the beginning of training on a new stimulus set (on the third day of training; no DCZ or saline treatments during the first 2 d; Extended Data Fig. 2-1D), there was a significant deficit in relearning to refuse the cue associated with smaller rewards at the beginning of the session (one drop; Extended Data Fig. 2-1D, bottom row; p(treatment) = 0.012; p(treatment × trials) = 6.8 × 10−7; two-way ANOVA). Thus, Rh to rmCD projections are involved not only in learning new stimulus–reward associations but also in the application of recently learned stimulus–reward associations.
Systemic DCZ treatment had no effect on total reward earned and total trials initiated for Monkey P (p = 0.176, 0.117 for total reward and trials, respectively; unpaired t test) whereas both decreased for Monkey C (p = 0.005, 0.002 for total reward and trials, respectively; unpaired t test). However, systemic DCZ treatment led to a decrease in total reward and total trials for Monkey C only when there were new cues. Thus, it appears that DCZ did not cause a general decrease in motivation or attention. The effects of systemic DCZ treatment on reaction times were not consistent across the two monkeys (Monkey P, p(treatment) > 0.05; p(treatment × trials) > 0.05 for all four rewards; Monkey C, p(treatment) = 1, 3.96 × 10−4, 1, 0.0035; p(treatment × trials) = 1, 1, 1, 1 for one-, two-, four-, eight-drop reward, respectively; two-way ANOVA).
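Several of the adjusted p values reported here are exactly 1. That is what a Bonferroni adjustment produces when a raw p value multiplied by the number of comparisons exceeds 1 and is capped. The sketch below shows the adjustment under the assumption that the family consists of the four reward sizes; the actual family size used in the study is not stated in this section, and the raw p values are hypothetical.

```python
def bonferroni(p_values):
    """Bonferroni-adjust a family of p values: multiply by family size, cap at 1."""
    m = len(p_values)  # assumed family size: one test per reward size
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p values for one-, two-, four-, and eight-drop rewards.
raw = [0.40, 0.0001, 0.60, 0.0009]
adjusted = bonferroni(raw)
```

With these inputs the first and third values are capped at 1, while the second and fourth remain small after multiplication by 4.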
The effects of systemic DCZ administration on learning could also be caused by the unilateral inactivation of neurons in brain regions other than Rh neurons projecting to the rmCD or local interneurons within rmCD. To investigate whether the DCZ was affecting behavior via PRh, ERh, both, or neither, we then made local microinjections of DCZ (3 μl × 4 sites [100 nM]) into PRh (Fig. 3C) or ERh (Extended Data Fig. 3-1C), which presumably selectively inactivate PRh or ERh neurons projecting to rmCD, as well as any collateral connections of the transduced PRh or ERh neurons. Consistent with systemic injection of DCZ, local microinjection of DCZ into PRh significantly impaired stimulus–reward association learning for one of the two smaller rewards (Fig. 3D; two-drop reward for Monkey C; p(treatment) = 0.0026; p(treatment × trials) = 1; one-drop reward for Monkey P; p(treatment) = 3.22 × 10−5; p(treatment × trials) = 8.28 × 10−10; two-way ANOVA. See Discussion about the impairment shift from two-drop to one-drop cues for Monkey P) and not the two larger rewards (Extended Data Fig. 3-1A; p(treatment) > 0.05; p(treatment × trials) > 0.05 for all comparisons; two-way ANOVA). In contrast, local microinjection of DCZ into ERh of Monkey P did not impair stimulus–reward association learning for either smaller or larger rewards (Extended Data Fig. 3-1D; p(treatment) > 0.05; p(treatment × trials) > 0.05 for all comparisons; two-way ANOVA).
We observed significant overall increases in reaction times for the small reward associations (Extended Data Fig. 3-1B, two-drop reward for Monkey C; p(treatment) = 0.0024; one-drop reward for Monkey P; p(treatment) = 0.0028; both p(treatment × trials) > 0.05; two-way ANOVA), the learning of which was impaired by local injections of DCZ into PRh, but not for the other small reward associations (Extended Data Fig. 3-1B, one-drop reward for Monkey C; two-drop reward for Monkey P; p(treatment) > 0.05; p(treatment × trials) > 0.05 for both comparisons; two-way ANOVA). The total reward earned and total trials initiated were not affected by local administration of DCZ (p > 0.05 for all comparisons; unpaired t test). Thus, the increase in reaction time during learning was not due to a general reduction in motivation or attention. Considering that local injections of DCZ into PRh avoid potential confounds associated with systemic DCZ injections, i.e., nonspecific effects of DCZ and unilateral interruption of interneurons in rmCD and neurons projecting to rmCD, we conclude that inactivation of PRh to rmCD projections not only impaired stimulus–reward association learning but also increased reaction times when the monkeys learned to refuse the cues associated with small reward. No significant effects on reaction times were introduced by local DCZ administration into PRh when monkeys accepted the two larger rewards by releasing the bar on the green square (Extended Data Fig. 3-1B; p(treatment) > 0.05; p(treatment × trials) > 0.05 for all comparisons; two-way ANOVA). In contrast, for local injections in ERh, no significant changes were observed in reaction times for learning small reward associations (Extended Data Fig. 3-1E; p(treatment) = 0.932, 1; p(treatment × trials) = 1, 0.856, for one and two drops, respectively; two-way ANOVA), but reaction times decreased for the two large reward associations when the monkey was required to wait 200 ms before releasing the bar on the green dot (Extended Data Fig. 3-1E; p(treatment) = 1.2 × 10−5, 0.002; p(treatment × trials) = 1, 1 for four and eight drops, respectively; two-way ANOVA).
Taken together, these findings suggest that the impaired stimulus–reward association learning observed after systemic injections of DCZ was selectively due to the inactivation of PRh to rmCD projections.
Discussion
There is a strong anatomical projection from the rhinal cortex to the rostromedial striatum (Choi et al., 2017; see also Fig. 1B). Each of these areas, the rhinal cortex and the rostromedial striatum, plays a role in stimulus–reward associations (Gaffan and Murray, 1992; Buckley and Gaffan, 1997; Thornton et al., 1997; Murray et al., 1998; Liu et al., 2000; Liu et al., 2004; Clark et al., 2012; Nagai et al., 2016; Oyama et al., 2022). Here, we have shown that when the activity of PRh neurons in that projection is blocked using an inhibitory DREADD, monkeys are slow to learn the values associated with visual cues, something that they usually learn within a few trials. Thus, we have shown that the PRh Layer 5 projection specifically contributes to the learning of stimulus–reward associations.
The accuracy of task performance was consistent across the two monkeys, with the exception that, for Monkey P, the impairment in learning shifted from the two-drop cues (Fig. 3B, right) to the one-drop cues (Extended Data Fig. 2-1D; Fig. 3D, right) over time. This impairment shift was first observed during systemic injections, just before local injections in PRh were started (Extended Data Fig. 4-1). Therefore, the impairment shift was not caused by different injection conditions (systemic vs local). Performance subsequently stabilized, resulting in consistent effects between systemic and local injections (Extended Data Fig. 2-1D vs Fig. 3D, right). We interpret this shift as resulting from an internal change in strategy or valuation for Monkey P. Although the shift was unexpected, it does not alter our main conclusion, i.e., that visual stimulus–reward association learning is impaired when PRh neurons projecting to rmCD are silenced.
We observed some inconsistency in the reaction times of the two monkeys after systemic DCZ administration (see Results and Table 1), which may have been caused by individual differences (e.g., differences in off-target effects of DCZ), counterbalancing of the surgical protocols, and/or different training histories. However, when projection neurons from the PRh to rmCD were selectively inhibited via local DCZ infusion, and when monkeys were extensively overtrained in the current task design, the effects on reaction time remained consistent. Specifically, only the reaction time for the small reward associations whose learning was impaired showed a significant increase (Extended Data Fig. 3-1B).
DCZ was developed (Nagai et al., 2020) to minimize the potential for off-target effects. Systemic administration of low-dose DCZ (0.1 mg/kg) in nonhuman primates without DREADD appears to have no effect on resting-state functional connectivity or reaction time in a probabilistic learning task. High-dose DCZ (0.3 mg/kg) affects these measures, including increased reaction time in the probabilistic learning task (Fujimoto et al., 2022). However, low-dose DCZ (0.1 mg/kg) was observed to decrease the reaction time in our deterministic stimulus–reward association learning task without hM4Di expression in one monkey (Monkey P; p(treatment) < 0.01 for all four rewards; p(treatment × trials) > 0.05 for all four rewards; two-way ANOVA). Future NHP studies should test DCZ for off-target effects prior to chemogenetic receptor expression.
After bilateral Rh removal, monkeys have difficulty with the application of learned stimulus–reward associations (Gaffan and Murray, 1992; Buckley and Gaffan, 1997; Thornton et al., 1997; Clark et al., 2012). Studies of the acquisition of stimulus–reward associations show conflicting results. After damage to the rhinal cortex, four studies failed to find impairment on visual discrimination learning for food reward (Gaffan and Murray, 1992; Buckley and Gaffan, 1997; Thornton et al., 1997; Thornton et al., 1998). Two other studies found impairments on a reward schedule task after bilateral Rh removal (Liu et al., 2000) or inactivation of D2 receptors in Rh (Liu et al., 2004). Our results here show that a consistent deficit in learning is seen when the Rh projection to the striatum is selectively inactivated. There are at least two explanations for these discrepancies. First, in visual discrimination learning tasks, the correct response elicits the same amount of reward delivery on every trial, thus implying that motivation is similar across trials. In contrast, in our tasks, the motivation level varies across trials. Rh may be necessary when monkeys are learning how to adjust motivation levels based on visual cues. Second, more independent stimulus–reward associations needed to be encoded and distinguished in our tasks (six reward schedule states or four reward sizes) than in visual discrimination learning tasks (two reward conditions, reward or no reward). Rh may be necessary when stimulus–reward association learning is challenging.
An alternative explanation for our results is that the inactivation of PRh neurons projecting to the rmCD impaired visual perception, thereby slowing the acquisition of stimulus–reward associations. However, PRh is implicated in visual learning only when images exhibit a high degree of feature ambiguity or when large sets of images are used in a single day (Buckley and Gaffan, 1997; Buckley et al., 2001; Bussey et al., 2002, 2003). In our task, only four visual cues were used each day, consisting of distinct grayscale natural scenes without morphing, blending, or other forms of ambiguity generation. Therefore, it is unlikely that the observed impairment was due to deficits in visual perception.
The rostral caudate (i.e., caudate head, rostral to anterior commissure) is known to guide behavior based on flexible values, not stable values (Kim and Hikosaka, 2013), a result consistent with our findings that projections from PRh to rmCD (a subdivision of rostral caudate) are involved in the learning of new stimulus–reward associations, but not in the application of overtrained stimulus–reward associations.
Our results indicate that the PRh to rmCD pathway is important for stimulus–reward association learning, but its specific role may extend beyond the formation of these associations. One possibility is that PRh neurons do not merely encode stimulus–reward associations but also convey reward-related information to the caudate during learning, thereby influencing behavioral choices. From this perspective, the impairment observed under inactivation may reflect not a failure to form stimulus–reward associations per se but rather a disruption of the behavioral expression of that learning, i.e., the behavioral use of reward information. Additionally, the slower learning observed under inactivation could reflect a shift in learning mechanisms, wherein stimulus–response associations formed by reinforcement learning, likely within the striatum, compensate for the compromised PRh-dependent stimulus–reward association learning. Future studies including neural recordings could further dissociate these learning strategies and clarify the distinct roles of PRh and striatal circuits in guiding reward-based adaptive learning and behavior.
The observation that projection neurons from PRh to rmCD, but not ERh to rmCD, were causally involved in learning visual stimulus–reward associations is consistent with the anatomy: PRh, not ERh, receives substantial direct input from ventral visual areas (Insausti and Amaral, 2008; Suzuki and Naya, 2014; Miyashita, 2019). After inhibition of ERh neurons projecting to rmCD, we did observe an earlier bar release when Monkey P was required to wait 200 ms before releasing the bar to accept an offer. When no waiting was required, i.e., bar release in the first interval to refuse an offer, the monkey’s bar release time was not affected. These results indicate that ERh neurons projecting to rmCD may be involved in timing behavior, which is consistent with the role of medial ERh in timing in rodents (Heys et al., 2020; Dias et al., 2021).
Dopamine is thought to be vital for reward-related learning (Wise, 2004; Schultz, 2007; Bromberg-Martin et al., 2010). In Rh, the D2 receptor contributes to stimulus–reward schedule learning (Liu et al., 2004) and is distributed most densely in the deep layers (Richfield et al., 1989; Goldsmith and Joyce, 1996; Liu et al., 2004), the same layers where neurons projecting to rmCD are located. This colocalization may indicate that Rh neurons projecting to rmCD are D2 receptor-positive neurons.
Some of the Rh to rmCD projection neurons are bifurcating projection neurons. For example, >3% of Rh neurons projecting to rmCD also project to OFC (unpublished data). Thus, this small portion of projections from Rh to OFC would also likely be inhibited in the current study. Although the proportion of bifurcating projection neurons is small and, hence, unlikely to mediate the deficits observed in the present study, we cannot rule out their influence on behavior.
In sum, our results provide strong evidence that projection neurons from PRh to rmCD play a causal role in the learning, but not the retrieval, of stimulus–reward associations. Our findings highlight the importance of this circuit in adaptive behavior and suggest that learning and retrieval rely on different neural pathways.
Footnotes
We thank Arya Mohanty for help in calculating the portion of Rh removed, Christine Chang for offering the data on bifurcating neurons, and Alexander C. Cummins for help with histology. We thank the staff of the Veterinary Medicine and Resources Branch and the Central Animal Facility, NIMH, for animal care and support. We thank the staff of the Section on Instrumentation, NIMH, for comprehensive engineering support. Anatomical MRI scanning was carried out in the Neurophysiology Imaging Facility Core (NIMH, NINDS, NEI). This work was supported by the Intramural Research Program of the National Institute of Mental Health (projects ZIAMH002619 (B.J.R.), ZIAMH002852 (R.B.I.), and ZIAMH002793 (V.W.P.)).
*W.W. and M.A.G.E. contributed equally to this work.
The authors declare no competing financial interests.
‡M.A.G.E.’s present address: Biosciences Institute, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom; T.S.’s present address: System Emotional Science, Faculty of Medicine, University of Toyama, Toyama, Japan; N.M.’s present address: School of Medicine & Health Sciences, George Washington University, Washington, DC 20052; J.E.P.’s present address: Department of Neurobiology, Harvard Medical School, Boston, Massachusetts 02115.
This paper contains supplemental material available at: https://doi.org/10.1523/JNEUROSCI.0491-25.2025
Correspondence should be addressed to Barry J. Richmond at barryrichmond@mail.nih.gov.