Abstract
Temporal costs influence reward-based decisions. This is commonly studied in temporal discounting tasks that involve choosing between cues signaling an imminent reward option or a delayed reward option. However, it is unclear whether the temporal delay before a reward can alter the value of that option. To address this, we identified the relative preference between different flavored rewards during a free-feeding test using male and female rats. Animals underwent training where either the initial preferred or the initial less preferred reward was delivered noncontingently. By manipulating the intertrial interval during training sessions, we could determine whether temporal delays impact reward preference in a subsequent free-feeding test. Rats maintained their initial preference if the same delays were used across all training sessions. When the initial less preferred option was delivered after short delays (high reward rate) and the initial preferred option was delivered after long delays (low reward rate), rats expectedly increased their preference for the initial less desirable option. However, rats also increased their preference for the initial less desirable option under the opposite training contingencies: delivering the initial less preferred reward after long delays and the initial preferred reward after short delays. These data suggest that sunk temporal costs enhance the preference for a less desirable reward option. Pharmacological and lesion experiments were performed to identify the neural systems responsible for this behavioral phenomenon. Our findings demonstrate the basolateral amygdala and retrosplenial cortex are required for temporal delays to enhance the preference for an initially less desirable reward.
SIGNIFICANCE STATEMENT The goal of this study was to determine how temporal delays influence reward preference. We demonstrate that delivering an initially less desirable reward after long delays subsequently increases the consumption and preference for that reward. Furthermore, we identified the basolateral amygdala and the retrosplenial cortex as essential nuclei for mediating the change in reward preference elicited by sunk temporal costs.
Introduction
The temporal costs associated with earning a reward influence reward-based decisions across species (Phillips et al., 2007; Rangel et al., 2008). In temporal discounting tasks, subjects will reliably choose a small reward delivered after a short delay over a large reward delivered after a long delay (Logan, 1965; Ainslie, 1974; Rodriguez and Logue, 1988; Evenden and Ryan, 1996; Richards et al., 1997; Green and Myerson, 2004; Vanderveldt et al., 2016). These behavioral findings suggest the estimation of reward value is negatively influenced by the anticipated temporal costs associated with earning the reward (Green and Myerson, 2004; Vanderveldt et al., 2016). However, temporal discounting tasks typically involve a binary choice, where choosing one option comes at the expense of the alternative outcome (Logan, 1965; Ainslie, 1974; Rodriguez and Logue, 1988; Evenden and Ryan, 1996; Richards et al., 1997). As such, this approach cannot directly determine whether temporal costs impact the value of the chosen option and/or alter the value of the alternative option.
Here, we developed a rodent task to determine how temporal costs before a reward delivery subsequently alter reward value and reward preference. We first identified the relative preference between different flavored food rewards in a free-feeding test. Next, rats underwent training where the initial preferred and less preferred rewards were delivered noncontingently in separate training sessions. One food reward flavor was always delivered after a short delay while the other food pellet flavor was always delivered after a long delay. In this manner, we could determine how temporal costs imposed during training subsequently impact reward value (consumption) and reward preference in a second free-feeding test.
Based on temporal discounting studies, one might expect that a reward delivered after a short delay will have a greater value because of the higher rate of reward experienced during training sessions (Khodadadi et al., 2014; Peterson et al., 2015; Vanderveldt et al., 2016). In support, we found that rats exhibited a greater preference for the initial less desirable option when this reward was delivered after short delays and the initial preferred reward was delivered after long delays. However, rats trained using the opposite contingencies did not exhibit a greater preference for the initial preferred reward. Rather, rats increased their consumption and preference for the initial less desirable reward, which suggests that sunk temporal costs can enhance the value of an originally less preferred option (Clement et al., 2000; Friedrich and Zentall, 2004; Alessandri et al., 2008). Pharmacological manipulations and lesion experiments were performed to identify the neural systems responsible for this behavioral phenomenon. We focused on neural systems involved with timing, reward valuation and learning, including the dopamine system (Wise, 2004; Berke, 2018), the orbitofrontal cortex (OFC) (Rolls, 2004; Izquierdo, 2017), the basolateral amygdala (BLA) (Wassum and Izquierdo, 2015), and the retrosplenial cortex (RSC) (Todd et al., 2015, 2019). Our data illustrate that the change in reward preference was unaffected by systemic dopamine receptor antagonism or OFC lesions. In contrast, lesions of the BLA or the RSC prevented the enhanced consumption and preference for the initial less desirable reward following long delay training sessions. These findings demonstrate that both the BLA and RSC participate in how temporal costs alter reward value and reward preference.
Materials and Methods
Subjects and surgery
All procedures were approved by the Institutional Animal Care and Use Committee at the University of Texas at San Antonio. Male and female Sprague Dawley rats (n = 105; 88 male, 17 female; Charles River) weighing 300–350 g were pair-housed on arrival and given ad libitum access to water and chow and maintained on a 12 h light/dark cycle.
All surgeries were performed under isoflurane anesthesia, and drug infusions were delivered at a rate of 0.1 µl/min. Surgical coordinates and injection volumes were based on prior research (Maren, 1999; McDannald et al., 2011; Powell et al., 2017). OFC-lesioned rats received injections of NMDA (12.5 µg/µl in saline vehicle; Tocris Bioscience) at the following locations (relative to bregma): 3.0 mm AP, ± 3.2 mm ML, −5.2 mm DV (0.05 µl); 3.0 mm AP, ± 4.2 mm ML, −5.2 mm DV (0.1 µl); 4.0 mm AP, ± 2.2 mm ML, −3.8 mm DV (0.1 µl); 4.0 mm AP, ± 3.7 mm ML, −3.8 mm DV (0.1 µl). BLA-lesioned rats received injections of NMDA (20 µg/µl) at the following locations: −3.3 mm AP, ± 4.6 mm ML, −8.6 mm DV (0.2 µl); −3.3 mm AP, ± 4.6 mm ML, −8.4 mm DV (0.1 µl). RSC-lesioned rats received injections of NMDA (20 µg/µl) at the following locations: −1.6 mm AP, ± 0.5 mm ML, −1.3 mm DV (0.26 µl); −2.8 mm AP, ± 0.5 mm ML, −1.3 mm DV (0.26 µl); −4.0 mm AP, ± 0.5 mm ML, −1.3 mm DV (0.26 µl); −5.3 mm AP, ± 0.5 mm ML, −2.0 mm DV (0.26 µl). All sham surgeries involved lowering the injector to the respective injection sites. Animals recovered for >1 week following surgery before beginning training.
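As a quick arithmetic check on the lesion parameters above, the per-hemisphere infusion volume, NMDA mass, and minimum infusion time can be tallied in a few lines. This is a sketch only: the dictionary layout and function name are illustrative and not part of the original methods, and it assumes each listed site is infused once per hemisphere at the stated 0.1 µl/min.

```python
# Per-hemisphere injection volumes (µl) and NMDA concentrations (µg/µl) from the text.
# The data layout and function name are illustrative assumptions.
LESIONS = {
    "OFC": {"conc_ug_per_ul": 12.5, "volumes_ul": [0.05, 0.1, 0.1, 0.1]},
    "BLA": {"conc_ug_per_ul": 20.0, "volumes_ul": [0.2, 0.1]},
    "RSC": {"conc_ug_per_ul": 20.0, "volumes_ul": [0.26, 0.26, 0.26, 0.26]},
}
INFUSION_RATE_UL_PER_MIN = 0.1  # infusion rate stated in the text


def per_hemisphere_totals(region):
    """Return (total volume in µl, NMDA mass in µg, infusion time in min)."""
    spec = LESIONS[region]
    volume = sum(spec["volumes_ul"])
    mass = volume * spec["conc_ug_per_ul"]
    minutes = volume / INFUSION_RATE_UL_PER_MIN
    return volume, mass, minutes


for region in LESIONS:
    volume, mass, minutes = per_hemisphere_totals(region)
    print(f"{region}: {volume:.2f} µl, {mass:.2f} µg NMDA, {minutes:.1f} min of infusion")
```

The infusion time here covers pump time only; any dwell time before withdrawing the injector is not described in the text and is not modeled.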
Training
Rats were placed and maintained on mild food restriction (∼15 g/d of standard laboratory chow) to target 90% free-feeding weight, allowing for an increase of 1.5% per week. Behavioral sessions were performed in chambers that had grid floors, a house light, and two food trays on a single wall. In free-feeding sessions, plastic barriers were placed over the food trays. Additionally, a plastic insert was placed over the grid floors that contained two fixed cups in which the food pellets were placed. Experimental 45 mg sucrose pellets that had an identical nutritional profile but differed in flavor (chocolate flavor #F0025 and banana flavor #F0024; Bio-Serv) were placed in their home cages to minimize neophobia. Rats first underwent a free-feeding session (10 min) in which a single food pellet flavor was offered (6.5 g total). On the following day, rats underwent a second free-feeding session in which the alternate flavor was offered (ordering counterbalanced between animals). For the free-feeding preference test, rats were allowed 10 min to consume both chocolate and banana food pellets that were freely available in cups affixed to the floor. To ensure an ample supply of food, we provided 13 g of each flavor, which was 3 g higher than the maximal amount consumed in pilot studies. We identified which reward flavor was the Initial Preferred and the Initial Less Preferred based on the food consumed during this preference test.
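The flavor assignment described above can be sketched as follows. The text does not say how consumption was measured, so this example assumes it is the offered mass (13 g per flavor) minus the mass left in the cup; the function name and the handling of ties are likewise hypothetical.

```python
OFFERED_G = 13.0  # grams of each flavor provided in the preference test (from the text)


def label_flavors(remaining_g):
    """Label flavors from the mass (g) left in each cup after the 10 min test.

    `remaining_g` maps flavor name -> grams remaining (an assumed measurement).
    Returns (initial_preferred, initial_less_preferred).
    """
    if len(remaining_g) != 2:
        raise ValueError("expected exactly two flavors")
    consumed = {flavor: OFFERED_G - left for flavor, left in remaining_g.items()}
    (a, grams_a), (b, grams_b) = consumed.items()
    if grams_a == grams_b:
        # The text does not describe ties; flag them rather than guess.
        raise ValueError("equal consumption: preference is undefined")
    return (a, b) if grams_a > grams_b else (b, a)


# Hypothetical rat that left 6.0 g of chocolate and 9.5 g of banana.
print(label_flavors({"chocolate": 6.0, "banana": 9.5}))
```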
Rats next underwent training sessions (1/d) in which one of the rewards was delivered for a total of 50 pellets per session. Food pellets were delivered noncontingently and were not preceded by reward-predictive cues during training sessions. In Short Delay sessions, one of the reward flavors was delivered after a 30 ± 5 s intertrial interval (ITI). In separate Long Delay sessions, the other reward flavor was delivered after a 60 ± 5 s ITI (Different Delay training). There were a total of 10 training sessions, which alternated between Long and Short Delay sessions with the first session counterbalanced between animals. Rats underwent a second free-feeding preference test following this training regimen. In a control experiment, rats were trained as described above, except that a 45 ± 5 s ITI was used for both the Initial Less Preferred and the Initial Preferred reward training sessions to ensure that rats were in the boxes for the same amount of total time as in the Different Delay training groups. In dopamine receptor antagonist experiments, injections of flupenthixol (225 µg/kg i.p., Tocris Bioscience) or saline were administered 1 h before training sessions, based on established research (Flagel et al., 2011). Flupenthixol was not administered before the preference tests.
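To make the timing of the training regimens concrete, the sketch below generates intertrial intervals for one session and checks that the Same Delay control matches the Different Delay groups in expected total time. The text gives only the mean and the ± 5 s range, so drawing ITIs from a uniform distribution is an assumption, as are the function and constant names.

```python
import random

PELLETS_PER_SESSION = 50
SESSIONS = 10  # total training sessions
MEAN_ITI_S = {"short": 30, "long": 60, "same": 45}
JITTER_S = 5


def session_itis(delay, rng):
    """Draw one ITI (s) per pellet; a uniform jitter is assumed, not stated in the text."""
    mean = MEAN_ITI_S[delay]
    return [rng.uniform(mean - JITTER_S, mean + JITTER_S) for _ in range(PELLETS_PER_SESSION)]


def expected_total_s(schedule):
    """Expected ITI time (s) across sessions, using the mean ITI of each session type."""
    return sum(MEAN_ITI_S[delay] * PELLETS_PER_SESSION for delay in schedule)


different = ["short", "long"] * (SESSIONS // 2)  # alternating Short and Long Delay sessions
same = ["same"] * SESSIONS

rng = random.Random(0)  # seeded only to make the example reproducible
itis = session_itis("short", rng)
print(min(itis) >= 25, max(itis) <= 35)
print(expected_total_s(different), expected_total_s(same))
```

With 50 pellets per session, the mean ITIs imply roughly 25 min of ITI time in a Short Delay session, 50 min in a Long Delay session, and 37.5 min in a Same Delay session, so both regimens total 22,500 s in expectation across the 10 sessions.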
Data analysis
The latency to respond was calculated as the length of time to make a head entry into the food port after a pellet was delivered. The latency measurement was capped at 25 s (shortest possible trial duration) to account for the rare occasion when a pellet was not retrieved before the subsequent trial. Anticipatory head entries were measured as the number of head entries into the food port during the 5 s preceding the reward delivery. The preference ratio was calculated as the amount consumed of the Initial Less Preferred reward relative to the total food consumed during the free-feeding test. A linear regression analysis was performed to determine how changes in food consumption related to changes in reward preference. We performed statistical analyses using GraphPad Prism 8. The effect of training on behavioral outcomes was analyzed using a paired t test or a mixed-effects model fit (restricted maximum likelihood method), repeated measures where appropriate, followed by a post hoc Sidak's test. The Geisser-Greenhouse correction was applied to address unequal variances between groups. The significance level was set to α = 0.05 for all tests.
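The three behavioral measures defined above can be expressed compactly. The following is a minimal sketch rather than the authors' analysis code: the function names, the timestamp-based inputs, and the encoding of an unretrieved pellet as `None` are assumptions made for illustration.

```python
LATENCY_CAP_S = 25.0         # shortest possible trial duration (from the text)
ANTICIPATORY_WINDOW_S = 5.0  # window preceding reward delivery (from the text)


def preference_ratio(less_preferred_g, preferred_g):
    """Initial Less Preferred consumption as a fraction of total consumption."""
    total = less_preferred_g + preferred_g
    if total <= 0:
        raise ValueError("no food consumed")
    return less_preferred_g / total


def retrieval_latency(delivery_s, first_entry_s):
    """Time from pellet delivery to the first head entry, capped at 25 s.

    `first_entry_s` is None when the pellet was not retrieved in time (assumed encoding).
    """
    if first_entry_s is None:
        return LATENCY_CAP_S
    return min(first_entry_s - delivery_s, LATENCY_CAP_S)


def anticipatory_entries(entry_times_s, delivery_s):
    """Count head entries in the 5 s preceding reward delivery."""
    start = delivery_s - ANTICIPATORY_WINDOW_S
    return sum(1 for t in entry_times_s if start <= t < delivery_s)


# Hypothetical values for one rat.
print(preference_ratio(2.0, 6.0))                              # 0.25
print(retrieval_latency(100.0, 131.0))                         # 25.0
print(anticipatory_entries([93.0, 96.5, 99.0, 101.0], 100.0))  # 2
```

Under this definition, a ratio below 0.5 means the rat ate more of the Initial Preferred flavor, and an increase in the ratio after training indicates a shift toward the Initial Less Preferred flavor.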
Histology
Rats were intracardially perfused with 4% PFA, and brains were removed and postfixed for at least 24 h. Brains were subsequently placed in 15% and 30% sucrose solutions in PBS. Brains were then flash frozen on dry ice, coronally sectioned, and stained with cresyl violet to verify the location and spread of the surgical lesions.
Results
We developed a rodent task to examine how temporal costs subsequently impact reward value and preference. After identifying the relative preference between chocolate and banana flavored food pellets, rats underwent training sessions in which one reward was delivered after a Short Delay (30 ± 5 s ITI) and the other reward was delivered after a Long Delay (60 ± 5 s ITI) in separate sessions. We hypothesized that the higher rate of reward in Short Delay sessions would subsequently enhance the preference for the reward delivered in those sessions. In the first set of experiments, the Initial Less Preferred reward was delivered in Short Delay sessions and the Initial Preferred reward was delivered in Long Delay sessions (Different Delay training; Fig. 1A). Rats increased anticipatory head entries into the food port across training sessions, but there was no difference between Short and Long Delay sessions (two-way mixed-effects analysis; session effect: F(2.07,18.63) = 29.92, p < 0.0001; delay effect: F(1,9) = 0.01, p = 0.91; n = 10; Fig. 1B; Table 1). Rats also decreased the post-reward latency into the food port across training, with no difference between session type (two-way mixed-effects analysis; session effect: F(1.17,10.53) = 16.07, p = 0.002; delay effect: F(1,9) = 0.03, p = 0.87; Fig. 1C; Table 1). Although there were no behavioral differences between Short and Long Delay sessions, this training regimen increased the rats' preference for the Initial Less Preferred reward (paired t test: t(9) = 3.13, p = 0.01; Fig. 1D; Table 1), which was because of increased consumption of that reward option (post hoc Sidak's test: t(9) = 4.43, p = 0.003; Fig. 1E; Table 1). There were no sex differences in anticipatory responding, latency to respond, the change in preference, or reward consumption (Table 1). 
These data illustrate that the short temporal delay before the delivery of the Initial Less Preferred reward subsequently enhanced the value and preference for that reward option.
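For readers who want to see how the paired comparison of preference ratios is formed, the sketch below computes the paired t statistic and its degrees of freedom from pre- and post-training values. The data are invented for illustration and the function name is hypothetical; the reported analyses were run in GraphPad Prism, and obtaining a p value would additionally require the t distribution.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired t test on post minus pre differences."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))  # sample SD, n - 1 denominator
    return mean_diff / se, len(diffs) - 1


# Invented preference ratios for four rats, before and after training.
pre = [0.20, 0.30, 0.25, 0.35]
post = [0.30, 0.45, 0.30, 0.50]
t, df = paired_t(pre, post)
print(f"t({df}) = {t:.2f}")
```

The degrees of freedom equal the number of rats minus one, which matches the t(9) reported for the n = 10 cohort above.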
Figure 1. Increased preference for the Initial Less Preferred reward delivered after short delays. A, Training schematic for the Different Delay training sessions: Initial Less Preferred after short delays. B, Anticipatory head entries into the food port during the 5 s before reward delivery for the Initial Less Preferred (Short Delay) and Initial Preferred (Long Delay) training sessions. C, Latency to make a head entry into the food port after a reward is delivered for the Initial Less Preferred and Initial Preferred training sessions. D, Preference ratio plotted as the amount of the Initial Less Preferred food consumed out of the total food consumed during preference tests. E, Reward consumption for each flavor during the preference tests. F, Training schematic for the Same Delay training sessions. G, Anticipatory head entries into the food port during the 5 s before reward delivery for the Initial Less Preferred (Medium Delay) and Initial Preferred (Medium Delay) training sessions. H, Latency to make a head entry into food port after a reward is delivered for the Initial Less Preferred and the Initial Preferred training sessions. I, Preference ratio. J, Reward consumption for each flavor during the preference tests. *p < 0.05.
Table 1. Statistical analyses for Figure 1
The change in reward preference could result from the delays imposed during training sessions or, alternatively, from increased exposure to the Initial Less Preferred reward. To address this possibility, we trained a separate group of rats as described above, except that a 45 ± 5 s ITI was used for both the Initial Less Preferred and Initial Preferred reward training sessions (Same Delay training; Fig. 1F). Rats increased anticipatory head entries into the food port across sessions, with no difference between session type (two-way mixed-effects analysis; session effect: F(2.83,16.99) = 37.89, p < 0.0001; delay effect: F(1,6) = 1.20, p = 0.31; n = 7; Fig. 1G; Table 1). There was no difference in the latency between sessions (two-way mixed-effects analysis; session effect: F(1.34,8.05) = 2.97, p = 0.12; delay effect: F(1,6) = 1.49, p = 0.27; Fig. 1H; Table 1). Rats undergoing the Same Delay training regimen did not alter their initial reward preference (paired t test: t(6) = 0.91, p = 0.40; Fig. 1I; Table 1), as there was no change in consumption of the Initial Less Preferred reward (post hoc Sidak's test: t(6) = 1.56, p = 0.31; Fig. 1J; Table 1) and an increased consumption of the Initial Preferred reward (post hoc Sidak's test: t(6) = 3.71, p = 0.02; Fig. 1J; Table 1). These experiments collectively demonstrate that the temporal delays during training sessions are responsible for altering reward preference.
In the next set of experiments, we examined whether the existing preference for the Initial Preferred reward could be further strengthened. To test this, a separate group of rats was trained to receive the Initial Less Preferred reward during Long Delay sessions and the Initial Preferred reward during Short Delay sessions (Fig. 2A). There was no difference in anticipatory head entries between Long and Short Delay sessions (two-way mixed-effects analysis; session effect: F(1.89,20.83) = 6.52, p < 0.01; delay effect: F(1,11) = 0.05, p = 0.83; n = 12; Fig. 2B; Table 2). There was also no difference in the latency to respond between session type (two-way mixed-effects analysis; session effect: F(1.32,14.48) = 54.17, p < 0.0001; delay effect: F(1,11) = 0.16, p = 0.70; Fig. 2C; Table 2). We anticipated rats would exhibit a stronger preference for the Initial Preferred reward, in line with the results from the first experiment (Fig. 1). However, we instead found that this training regimen increased the rats' preference for the Initial Less Preferred reward (paired t test: t(11) = 2.33, p = 0.04; Fig. 2D; Table 2) because of increased consumption of that reward option (post hoc Sidak's test: t(11) = 2.65, p < 0.05; Fig. 2E; Table 2). This cohort also exhibited an increased consumption of the Initial Preferred reward after training (post hoc Sidak's test: t(11) = 2.98, p = 0.03; Fig. 2E; Table 2). Given these unexpected findings, we performed further behavioral analyses to relate the change in preference to changes in food consumption. A stronger increase in preference toward the Initial Less Preferred reward was positively related to the consumption of that reward option (r2 = 0.96, p < 0.0001; Fig. 2F; Table 2) and inversely related to the consumption of the Initial Preferred reward (r2 = 0.38, p = 0.03; Fig. 2G; Table 2). Our results demonstrate that delivering the Initial Less Preferred reward after long delays subsequently enhances the preference for that option.
Furthermore, this establishes a training procedure to examine how sunk temporal costs subsequently influence reward consumption and preference (Clement et al., 2000; Friedrich and Zentall, 2004; Alessandri et al., 2008).
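The regression analyses relating consumption to preference can be sketched with ordinary least squares. This example uses invented numbers and hypothetical names; following the figure captions, the change in food consumption is treated as a function of the change in the preference ratio, and only the slope, intercept, and r2 are computed.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept, (sxy * sxy) / (sxx * syy)


# Invented data: change in preference ratio (x) and change in grams consumed (y).
d_pref = [-0.05, 0.00, 0.10, 0.20, 0.30]
d_less_preferred_g = [-0.4, 0.1, 1.2, 2.1, 3.0]
slope, intercept, r2 = linear_fit(d_pref, d_less_preferred_g)
print(f"slope = {slope:.2f} g per unit ratio, r2 = {r2:.2f}")
```

A positive slope corresponds to the pattern reported for the Initial Less Preferred reward, and a negative slope corresponds to the inverse relationship reported for the Initial Preferred reward.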
Table 2. Statistical analyses for Figure 2
Figure 2. Increased preference for the Initial Less Preferred reward delivered after long delays. A, Training schematic for the Different Delay training sessions: Initial Less Preferred after long delays. B, Anticipatory head entries into the food port during the 5 s before reward delivery for the Initial Less Preferred (Long Delay) and Initial Preferred (Short Delay) training sessions. C, Latency to make a head entry into the food port after a reward is delivered for the Initial Less Preferred and Initial Preferred training sessions. D, Preference ratio plotted as the amount of the Initial Less Preferred food consumed out of the total food consumed during preference tests. E, Reward consumption for each flavor during the preference tests. F, G, Linear regression relating the change in food consumption as a function of the change in the preference ratio. *p < 0.05.
We next sought to identify the neural systems responsible for sunk temporal costs increasing the preference for an initially less desirable reward. Perturbations of the dopamine system affect decisions in timing-related tasks (Cardinal et al., 2000; Wade et al., 2000; Koffarnus et al., 2011). Additionally, dopamine neurons contribute to reward learning, and encode the preference between options as well as changes in reward value (Fiorillo et al., 2003; Wise, 2004; Tobler et al., 2005; Gan et al., 2010; Steinberg et al., 2013; Lak et al., 2014; Fonzi et al., 2017; Berke, 2018). As such, we examined whether the change in reward preference could be prevented by administering systemic injections of flupenthixol before training sessions (Fig. 3A). Anticipatory head entries into the food port were disrupted by flupenthixol treatment across sessions (three-way mixed-effects analysis; session effect: F(3.02,66.43) = 22.26, p < 0.0001; treatment effect: F(1,87) = 11.22, p < 0.01; session × treatment effect: F(4,87) = 11.22, p < 0.0001; n = 12 saline, n = 12 flupenthixol; Fig. 3B; Table 3). Flupenthixol treatment also slowed the time to retrieve the reward during training sessions (three-way mixed-effects analysis; session effect: F(4,88) = 43.04, p < 0.0001; treatment effect: F(1,22) = 6.18, p = 0.02; Fig. 3C; Table 3). However, antagonizing dopamine receptors during training sessions did not prevent the increased preference for the Initial Less Preferred reward (two-way mixed-effects analysis; training effect: F(1,22) = 30.86, p < 0.0001; treatment effect: F(1,22) = 0.11, p = 0.74; interaction effect: F(1,22) = 0.77, p = 0.39; post hoc Sidak's test; saline: t(22) = 4.55, p < 0.001; flupenthixol: t(22) = 3.31, p < 0.01; Fig. 3D; Table 3). Both saline- and flupenthixol-treated rats increased the consumption of the Initial Less Preferred reward (post hoc Sidak's test; saline: t(11) = 4.92, p < 0.001; flupenthixol: t(11) = 3.54, p < 0.01; Fig. 3E; Table 3) with no change in consumption of the Initial Preferred reward (post hoc Sidak's test; saline: t(11) = 0.91, p = 0.62; flupenthixol: t(11) = 1.77, p = 0.20; Fig. 3E; Table 3). The change in reward preference was positively related to the consumption of the Initial Less Preferred reward (saline: r2 = 0.72, p < 0.001, flupenthixol: r2 = 0.74, p < 0.001; Fig. 3F; Table 3) and inversely related to the consumption of the Initial Preferred reward in both saline- and flupenthixol-treated animals (saline: r2 = 0.54, p < 0.01, flupenthixol: r2 = 0.46, p = 0.02; Fig. 3G; Table 3). The only sex difference identified was in flupenthixol-treated rats where there was a longer latency to retrieve the pellet in female rats (three-way mixed-effects analysis; sex effect: F(1,40) = 5.44, p = 0.03; session × sex effect: F(4,40) = 2.23, p = 0.08; Table 3), which is consistent with prior work demonstrating sex-dependent effects of flupenthixol on latency (Eubig et al., 2014). These data collectively demonstrate that the dopamine system regulates both anticipatory head entries and the latency to respond during training sessions but does not mediate the increased preference for the less desirable reward elicited by high temporal costs.
Table 3. Statistical analyses for Figure 3
Figure 3. The enhanced preference for the Initial Less Preferred reward delivered after long delays does not involve dopamine signaling. A, Training schematic. B, Anticipatory head entries into the food port in rats that received injections of saline or flupenthixol. C, Latency to make a head entry into the food port after a reward is delivered during training sessions in rats that received injections of saline or flupenthixol. D, Preference ratio in rats receiving saline (left) or flupenthixol (right) injections. E, Reward consumption for each flavor during the preference tests in rats receiving saline (left) or flupenthixol (right) injections. F, G, Linear regression relating the change in food consumption as a function of the change in the preference ratio. *p < 0.05. **p < 0.01. ***p < 0.001.
We then examined whether the OFC could mediate the change in preference as lesions to the OFC alter decision-making in timing-related tasks (Kheramin et al., 2004; Winstanley et al., 2004). Additionally, the OFC encodes value-based parameters and participates in reward-learning and decision-making (Rolls, 2004; Schoenbaum et al., 2009; McDannald et al., 2011; Padoa-Schioppa, 2013; Rhodes and Murray, 2013; Stalnaker et al., 2015; Izquierdo, 2017; Padoa-Schioppa and Conen, 2017). We performed excitotoxic lesions of the OFC or sham surgeries before the initial preference test (Fig. 4A,B). There was no difference in anticipatory head entries into the food port between sham and OFC-lesioned rats across training sessions (three-way mixed-effects analysis; session effect: F(4,72) = 42.42, p = 0.0001; treatment effect: F(1,18) = 1.33, p = 0.27; delay effect: F(1,18) = 0.43, p = 0.52; interaction effect: F(4,72) = 0.25, p = 0.91; n = 10 sham rats, n = 10 lesion rats; Fig. 4C; Table 4). The latency to retrieve the food pellet also did not differ between sham and OFC-lesioned rats across training sessions (three-way mixed-effect analysis; session effect: F(4,72) = 42.42, p < 0.0001; treatment effect: F(1,18) = 0.10, p = 0.76; delay effect: F(1,18) = 0.05, p = 0.82; interaction effect: F(4,72) = 0.90, p = 0.47; Fig. 4D; Table 4). Lesioning the OFC did not prevent the change in preference following training (two-way mixed-effects analysis; training effect: F(1,18) = 28.64, p < 0.0001; treatment effect: F(1,18) = 0.35, p = 0.56; interaction effect: F(1,18) = 0.0001, p = 0.99; Fig. 4E; Table 4). Training resulted in an enhanced preference for the Initial Less Preferred reward in sham surgery and OFC-lesioned rats (post hoc Sidak's test; sham: t(18) = 3.78, p < 0.01; lesion: t(18) = 3.79, p < 0.01; Fig. 4E; Table 4). 
Both groups exhibited a selective increase in the consumption of the Initial Less Preferred reward (post hoc Sidak's test; sham: t(9) = 3.89, p < 0.01; lesion: t(9) = 3.26, p = 0.02; Fig. 4F; Table 4) with no change in the consumption of the Initial Preferred reward (post hoc Sidak's test; sham: t(9) = 0.47, p = 0.88; lesion: t(9) = 0.73, p = 0.73; Fig. 4F; Table 4). The change in reward preference was positively related to the consumption of the Initial Less Preferred reward (sham: r2 = 0.95, p < 0.001, lesion: r2 = 0.85, p < 0.001; Fig. 4G; Table 4) and inversely related to the consumption of the Initial Preferred reward in sham surgery and OFC-lesioned rats (sham: r2 = 0.68, p < 0.01, lesion: r2 = 0.48, p = 0.03; Fig. 4H; Table 4). Therefore, the OFC is not involved with the enhanced preference for an initially less desirable reward that follows a long delay in training sessions.
Table 4. Statistical analyses for Figure 4
Figure 4. The OFC is not required for the enhanced preference for the Initial Less Preferred reward delivered after long delays. A, Training schematic. B, Top, The extent of OFC lesions across three coronal planes with the anterior distance from bregma (millimeters) indicated. Bottom, Representative OFC lesion. C, Anticipatory head entries into the food port in sham or OFC-lesioned rats. D, Latency to make a head entry into the food port after a reward is delivered during training sessions in sham or OFC-lesioned rats. E, Preference ratio in sham (left) or OFC-lesioned (right) rats. F, Reward consumption for each flavor during the preference tests in sham (left) or OFC-lesioned (right) rats. G, H, Linear regression relating the change in food consumption as a function of the change in the preference ratio. *p < 0.05. **p < 0.01.
Increasing evidence highlights that the BLA is a critical nucleus that contributes to learning, timing-related decisions, reward valuation, and reward seeking (Maren, 1999; Winstanley et al., 2004; Ambroggi et al., 2008; Namburi et al., 2015; Wassum and Izquierdo, 2015; Malvaez et al., 2019; Morse et al., 2020). As such, the BLA could potentially mediate the change in reward preference following the training regimen where the Initial Less Preferred reward was delivered after long delays. Rats underwent a surgery to lesion the BLA or a sham procedure before the initial preference test (Fig. 5A,B). The anticipatory head entries into the food port did not differ between sham and BLA-lesioned rats across training sessions (three-way mixed-effects analysis; session effect: F(2.94,41.22) = 10.45, p < 0.0001; treatment effect: F(1,14) = 0.01, p = 0.92; delay effect: F(1,14) = 0.004, p = 0.95; interaction effect: F(4,56) = 0.50, p = 0.74; n = 8 sham rats, n = 8 lesion rats; Fig. 5C; Table 5). There was also no difference in the latency to retrieve a food pellet between sham surgery and BLA-lesioned rats across training sessions (three-way mixed-effects analysis; session effect: F(4,56) = 71.96, p < 0.0001; treatment effect: F(1,14) = 0.19, p = 0.67; delay effect: F(1,14) = 0.26, p = 0.62; interaction effect: F(4,56) = 0.05, p = 0.99; Fig. 5D; Table 5). However, lesioning the BLA prevented the change in preference following training (two-way mixed-effects analysis; training effect: F(1,14) = 30.69, p < 0.0001; treatment effect: F(1,14) = 3.55, p = 0.08; interaction effect: F(1,14) = 15.17, p < 0.01; Fig. 5E; Table 5). Sham surgery rats exhibited an enhanced preference for the Initial Less Preferred reward (post hoc Sidak's test; sham: t(14) = 6.67, p < 0.0001; Fig. 5E; Table 5), which was because of a selective increase in the consumption of the Initial Less Preferred reward (post hoc Sidak's test; Initial Less Preferred flavor: t(7) = 5.99, p < 0.01; Initial Preferred flavor: t(7) = 2.2, p = 0.12; Fig. 5F; Table 5). In contrast, BLA lesions prevented the change in reward preference (post hoc Sidak's test; lesion: t(14) = 1.16, p = 0.46; Fig. 5E; Table 5), as rats selectively increased the consumption of the Initial Preferred reward (post hoc Sidak's test; Initial Less Preferred flavor: t(7) = 2.4, p = 0.09; Initial Preferred flavor: t(7) = 4.13, p < 0.01; Fig. 5F; Table 5). The change in reward preference in sham surgery rats was inversely related to the change in consumption of the Initial Preferred reward (Initial Less Preferred: r2 = 0.02, p = 0.14; Initial Preferred: r2 = 0.60, p = 0.02; Fig. 5G,H; Table 5). A similar trend was observed in BLA-lesioned rats (Initial Less Preferred: r2 = 0.38, p = 0.1; Initial Preferred: r2 = 0.52, p = 0.04; Fig. 5G,H; Table 5). These data collectively illustrate that the BLA is required for the enhanced preference for a less desirable reward associated with high temporal costs.
Table 5. Statistical analyses for Figure 5
Figure 5. The BLA is required for the enhanced preference for the Initial Less Preferred reward delivered after long delays. A, Training schematic. B, Top, The extent of BLA lesions across four coronal planes with the anterior distance from bregma (millimeters) indicated. Bottom, Representative BLA lesion. C, Anticipatory head entries into the food port in sham or BLA-lesioned rats. D, Latency to make a head entry into the food port after a reward is delivered during training sessions in sham or BLA-lesioned rats. E, Preference ratio in sham (left) or BLA-lesioned (right) rats. F, Reward consumption for each flavor during the preference tests in sham (left) or BLA-lesioned (right) rats. G, H, Linear regression relating the change in food consumption as a function of the change in the preference ratio. **p < 0.01. ****p < 0.0001.
The RSC is another brain region that could participate in sustained changes in reward preference given its role in learning and timing-related decisions, and that RSC neurons respond to rewards to encode value-related signals (Todd et al., 2015, 2019; Vedder et al., 2017; Hattori et al., 2019; Fischer et al., 2020). To address this possibility, rats underwent an RSC lesion or a sham surgery before the initial preference test (Fig. 6A,B). We found a significant interaction between treatment and delay on the anticipatory head entries between sham and RSC-lesioned rats (three-way mixed-effects analysis; session effect: F(3.09,43.21) = 14.14, p < 0.0001; treatment effect: F(1,53) = 0.10, p = 0.75; delay effect: F(1,14) = 0.06, p = 0.81; treatment × delay effect: F(1,53) = 5.04, p = 0.03; interaction effect: F(4,53) = 2.44, p = 0.06; n = 8 sham rats, n = 8 lesion rats; Fig. 6C; Table 6). There was no difference in the post-reward latency to retrieve a food pellet between sham surgery and RSC-lesioned rats across training sessions (three-way mixed-effect analysis; session effect: F(4,56) = 43.99, p < 0.0001; treatment effect: F(1,53) = 0.18, p = 0.68; delay effect: F(1,14) = 0.59, p = 0.45; interaction effect: F(4,53) = 1.26, p = 0.30; Fig. 6D; Table 6). Lesioning the RSC prevented the change in preference following training (two-way mixed-effects analysis; training effect: F(1,14) = 10.64, p < 0.01; treatment effect: F(1,14) = 12.69, p < 0.01; interaction effect: F(1,14) = 17.36, p = 0.001; Fig. 6E; Table 6). Sham rats exhibited an enhanced preference for the Initial Less Preferred reward (post hoc Sidak's test; sham: t(14) = 5.25, p = 0.0002; Fig. 6E; Table 6), which was accompanied by an increased consumption of both rewards following training (post hoc Sidak's test; Initial Less Preferred flavor: t(7) = 4.31, p < 0.01; Initial Preferred flavor: t(7) = 2.95, p = 0.04; Fig. 6F; Table 6). 
RSC-lesioned rats consumed a greater amount of the Initial Preferred reward relative to sham controls during the pretraining preference test (post hoc Sidak's test; pretraining: t(28) = 2.98, p = 0.01; Table 6). There was no change in preference following training (post hoc Sidak's test; lesion: t(14) = 0.64, p = 0.78; Fig. 6E; Table 6), as RSC-lesioned rats exhibited a further increase in the consumption of the Initial Preferred reward (post hoc Sidak's test; Initial Less Preferred flavor: t(7) = 0.28, p = 0.95; Initial Preferred flavor: t(7) = 10.63, p < 0.0001; Fig. 6F; Table 6). The change in reward preference was positively correlated with the change in consumption of the Initial Less Preferred reward in both the sham and RSC-lesioned groups (Initial Less Preferred: sham: r² = 0.52, p = 0.04; lesion: r² = 0.87, p < 0.001; Initial Preferred: sham: r² = 0.02, p = 0.74; lesion: r² = 0.36, p = 0.11; Fig. 6G,H; Table 6). Our results highlight that both the BLA and RSC are necessary for increasing the preference toward an initially less desirable reward option associated with sunk temporal costs.
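As an illustration of the regression statistic reported above, the following minimal Python sketch computes the coefficient of determination for a simple least-squares line relating the change in preference ratio to the change in food consumption. The function name `r_squared` and the example values are hypothetical and are not taken from the study; the sketch is not the analysis pipeline used here and does not compute the associated p value.

```python
from math import fsum


def r_squared(x, y):
    """Coefficient of determination for a least-squares line y = a + b*x.

    For simple linear regression this equals the squared Pearson correlation.
    """
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length samples with at least 3 points")
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    # Sums of squares and cross-products around the sample means
    sxx = fsum((xi - mean_x) ** 2 for xi in x)
    syy = fsum((yi - mean_y) ** 2 for yi in y)
    sxy = fsum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("r^2 is undefined when either variable has zero variance")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical per-rat values for illustration only (NOT data from the study)
    delta_preference = [0.05, 0.10, 0.20, 0.30, 0.45]
    delta_consumption = [0.5, 1.5, 2.0, 3.5, 4.0]
    print(round(r_squared(delta_preference, delta_consumption), 3))
```

Values near 1 indicate that the change in consumption of a given flavor tracks the change in preference closely, whereas values near 0 indicate little linear relationship.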
Statistical analyses for Figure 6
The RSC is required for the enhanced preference for the Initial Less Preferred reward delivered after long delays. A, Training schematic. B, Top, The extent of RSC lesions across four coronal planes with the anterior distance from bregma (millimeters) indicated. Bottom, Representative RSC lesion. C, Anticipatory head entries into the food port in sham or RSC-lesioned rats. D, Latency to make a head entry into the food port after a reward is delivered during training sessions in sham or RSC-lesioned rats. E, Preference ratio in sham (left) or RSC-lesioned (right) rats. F, Reward consumption for each flavor during the preference tests in sham (left) or RSC-lesioned (right) rats. G, H, Linear regression relating the change in food consumption as a function of the change in the preference ratio. *p < 0.05. **p < 0.01. ***p < 0.001. ****p < 0.0001.
When examining all subjects trained to experience the Initial Less Preferred reward after a Long Delay (Figs. 2-6), the change in reward preference was positively related to the change in consumption of the Initial Less Preferred reward (r² = 0.76, p < 0.001) and inversely related to the change in consumption of the Initial Preferred reward (r² = 0.47, p < 0.001; n = 88 rats). This suggests that rats exhibiting a robust increase in preference toward the Initial Less Preferred reward could potentially decrease the consumption of the Initial Preferred reward following training. We therefore analyzed the relative food consumption across subjects based on whether there was a mild increase in preference toward the Initial Preferred reward (Fig. 7A, orange), a mild increase in preference toward the Initial Less Preferred reward (Fig. 7A, light blue), or a robust increase in preference toward the Initial Less Preferred reward (Fig. 7A, dark blue). The relative change in food consumption for each reward differed according to the change in reward preference (two-way mixed-effects analysis; flavor effect: F(1,85) = 3.89, p = 0.05; change in preference effect: F(2,85) = 5.78, p < 0.01; interaction effect: F(2,85) = 116.2, p < 0.0001; Fig. 7B–D; Table 7). Rats exhibiting a mild increase in preference toward the Initial Preferred reward selectively increased the consumption of the Initial Preferred reward (one-sample t test relative to 0: Initial Less Preferred: t(15) = 0.69, p = 0.50; Initial Preferred: t(15) = 8.39, p < 0.0001; Fig. 7B; Table 7). Rats with a mild increase in preference toward the Initial Less Preferred reward increased the consumption of both rewards (one-sample t test relative to 0: Initial Less Preferred: t(44) = 88.88, p < 0.0001; Initial Preferred: t(44) = 5.47, p < 0.0001; Fig. 7C; Table 7).
Interestingly, rats with a robust increase in preference toward the Initial Less Preferred reward increased the consumption of the Initial Less Preferred reward and decreased the consumption of the Initial Preferred reward (one-sample t test relative to 0: Initial Less Preferred: t(26) = 16.99, p < 0.0001; Initial Preferred: t(26) = 2.23, p = 0.04; Fig. 7D; Table 7). Together, these results highlight that sunk temporal costs can positively impact the value of an Initial Less Preferred reward as well as negatively impact the value of the alternative initially preferred option.
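The grouping and per-group tests described above can be sketched in Python as follows. The cutoffs follow the Figure 7A legend (change in the preference ratio < 0, between 0 and 0.4, and > 0.4); assigning values of exactly 0 or 0.4 to the middle group is our reading of that legend. The function names, group labels, and example numbers are hypothetical, and the sketch returns only the t statistic and degrees of freedom, not a p value.

```python
from math import sqrt
from statistics import mean, stdev


def preference_group(delta_ratio):
    """Assign a rat to a group from its change in preference ratio."""
    if delta_ratio < 0:
        return "toward_initial_preferred"          # orange in Fig. 7A
    if delta_ratio <= 0.4:
        return "mild_toward_less_preferred"        # light blue in Fig. 7A
    return "robust_toward_less_preferred"          # dark blue in Fig. 7A


def one_sample_t(values, mu=0.0):
    """One-sample t statistic against mu; returns (t, degrees of freedom)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("t is undefined when all observations are identical")
    t = (mean(values) - mu) / (sd / sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Hypothetical (change in preference ratio, change in consumption) pairs;
    # these are NOT data from the study.
    rats = [(-0.10, 0.2), (-0.05, -0.1), (0.15, 1.0), (0.30, 1.4),
            (0.35, 0.9), (0.55, 2.5), (0.70, 3.1), (0.60, 2.2)]
    groups = {}
    for delta_ratio, delta_consumption in rats:
        groups.setdefault(preference_group(delta_ratio), []).append(delta_consumption)
    for name, changes in groups.items():
        t, df = one_sample_t(changes)
        print(f"{name}: t({df}) = {t:.2f}")
```

Testing each group's change in consumption against 0 asks whether consumption of a given flavor increased or decreased after training, which is the comparison reported for each panel of Figure 7B–D.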
Relating the change in preference to the change in the food consumption. A, Change in the preference ratio across all rats that underwent the Different Delay training sessions: Initial Less Preferred after long delays. Color overlays represent a mild increase in preference toward the Initial Preferred reward (orange; change in the preference ratio < 0), a mild increase in preference toward the Initial Less Preferred reward (light blue; change in the preference ratio between 0 and 0.4), and a robust increase in preference toward the Initial Less Preferred reward (dark blue; change in the preference ratio > 0.4). B, Change in the food consumption in rats that displayed a mild increase in preference toward the Initial Preferred reward. C, Change in the food consumption in rats that displayed a mild increase in preference toward the Initial Less Preferred reward. D, Change in the food consumption in rats that displayed a robust increase in preference toward the Initial Less Preferred reward. *p < 0.05, ****p < 0.0001.
Statistical analyses for Figure 7
Discussion
Prior research has identified a number of factors that can alter reward preference. For example, allowing free access to one reward option before a preference test will devalue that reward and elicit a transient change in preference (Roesch et al., 2007; Cone et al., 2016; Papageorgiou et al., 2016). Long-term changes in preference can be induced by pairing a reward with an aversive outcome (Colby and Smith, 1977). Here, we demonstrate that the temporal delay before a reward delivery can enhance the preference for an initially less desirable option. In the first experiment, the Initial Less Preferred reward was delivered after a short delay while the Initial Preferred reward was delivered after a long delay. Under this training regimen, rats increased their consumption and preference for the Initial Less Preferred option. These results indicate reward preference was enhanced by the higher rate of reward during Short Delay training sessions, which is consistent with decisions influenced by maximizing the reward rate (Khodadadi et al., 2014). This behavioral phenomenon is also in line with temporal discounting studies showing increased preference for a reward delivered after a short delay/higher reward rate (Logan, 1965; Ainslie, 1974; Rodriguez and Logue, 1988; Evenden and Ryan, 1996; Richards et al., 1997; Green and Myerson, 2004; Peterson et al., 2015; Vanderveldt et al., 2016). We note that the change in reward preference is likely not mediated by differences in satiety between training sessions, since rats underwent only a single training session per day and the same number of food pellets was delivered in each training session. Likewise, differences in reward expectation do not mediate the change in preference, since there was no difference in anticipatory responding between Long and Short Delay sessions.
Furthermore, in control experiments, we found that rats maintained their initial reward preference when the delay to the reward delivery was held constant for both flavors, which illustrates that rats are appropriately discriminating between the reward options within this task design.
Our findings demonstrate that the impact of temporal delays on enhancing the preference for a particular reward depends on whether that option was initially preferred or not. We identified an asymmetry in which temporal delays increase the preference only for the initially less desirable option. Rats did not exhibit a stronger preference for the Initial Preferred option when this reward was delivered after short delays. Rather, rats trained in this manner increased their preference for the Initial Less Preferred option that was delivered after long delays. We propose these results reflect the impact of sunk costs on reward preference (Clement et al., 2000; Marsh and Kacelnik, 2002; Friedrich and Zentall, 2004; Navarro and Fantino, 2005; Alessandri et al., 2008). Human studies have identified behavioral consequences from a variety of sunk costs that are imposed on the subject, including embarrassment as well as political, personal, and financial costs, although these costs are challenging to model in animals (Aronson and Mills, 1959; Staw, 1976; Strube, 1988; Haller and Schwabe, 2014; Fujino et al., 2016). Recent studies have examined the behavioral impact of sunk temporal costs using a choice-based foraging task (Wikenheiser et al., 2013; Abram et al., 2016; Sweis et al., 2018). In this foraging task, choosing one option comes at the expense of a potential alternative outcome, so one cannot determine whether sunk costs increase the value of the chosen outcome and/or decrease the value of the alternative outcome. Our behavioral paradigm allowed us to determine that a robust change in preference toward an Initial Less Preferred reward delivered after long temporal delays was a product of changes in consumption of both rewards: increased consumption of the Initial Less Preferred reward as well as decreased consumption of the Initial Preferred reward.
Together, these findings indicate that a high temporal cost before the delivery of a less desirable reward can positively impact the value of that reward as well as negatively impact the value of the alternative option.
Correlative studies have linked neural activity to the behavioral effects of sunk costs (Fujino et al., 2016, 2018). Here we aimed to elucidate the neural systems that are required for sunk temporal costs to enhance the preference for an initially less desirable reward. We concentrated on brain regions involved with timing, reward valuation, and reward learning. Dopamine signals reflect changes in preference induced by state-specific satiety (Cone et al., 2016; Papageorgiou et al., 2016). Additionally, dopamine receptor antagonism impairs preference changes arising from conditioned taste aversion (Fenu et al., 2001). In our task, we found that antagonizing dopamine receptors during training sessions increased the latency to retrieve the reward and decreased anticipatory head entries into the food port, consistent with dopamine's role in regulating both locomotor activity and the acquisition of anticipatory responding (Pitts and Horvitz, 2000; Flagel et al., 2011; Trost and Hauber, 2014). Despite the motoric effects of dopamine receptor antagonism during training sessions, rats still displayed enhanced preference for the initial less desirable reward. Therefore, behavioral performance during training sessions is not linked to a subsequent change in reward preference.
OFC lesions failed to prevent changes in reward preference, which agrees with previous reports that lesions to the OFC did not alter outcome preference or behavioral flexibility (Keiflin et al., 2013; Rudebeck et al., 2013). However, prior research has found that OFC lesions impair the ability to update choices following selective satiation (Rhodes and Murray, 2013). In addition, OFC lesions do not prevent changes in preference induced by taste aversion but can disrupt devaluation of the associated cue (Gallagher et al., 1999). While the dopamine system and the OFC are active participants in other forms of preference changes, our findings suggest these two systems are not involved with sunk temporal costs enhancing the preference for an initially less desirable reward option.
Our results demonstrate that lesioning the BLA before training sessions prevented temporal costs from influencing changes in preference. Previous studies have implicated BLA neurons in reward learning and memory formation (Namburi et al., 2015; Morse et al., 2020). The BLA also has a role in reward preference as studies demonstrate the BLA maintains representations of appetitive and economic value (Cador et al., 1989; Winstanley et al., 2004; Leathers and Olson, 2017). BLA neurons additionally encode the value of a chosen reward and inactivating the BLA decreased choice for a more preferred option (Hart and Izquierdo, 2017; Jezzini and Padoa-Schioppa, 2020). Moreover, BLA inactivation impairs choice following reinforcer devaluation (West et al., 2012; Hart and Izquierdo, 2017). Our findings coupled with prior work collectively highlight a critical role for the BLA in updating reward value.
The RSC has been well studied for its contributions to memory formation, spatial navigation, and timing (Vann et al., 2009; Todd et al., 2015, 2019; Miller et al., 2019; Trask et al., 2021). However, increasing evidence illustrates that the RSC also contains reward-responsive neurons that encode reward value (Vedder et al., 2017; Hattori et al., 2019; Fischer et al., 2020). Furthermore, inactivation of the RSC impaired the ability to adapt behavior based on the reward history (Hattori et al., 2019). Consistent with the RSC's role in reward-based behavior, RSC lesions before training sessions prevented the change in preference induced by past temporal costs. RSC lesions also differentially altered the anticipatory responding between training sessions. However, as discussed above, the behavioral responses during training sessions are not predictive of changes in reward preference.
Together, we find that lesions of the BLA or the RSC prevented temporal delays from increasing the preference for an initially less desirable option. However, it is possible that the role of the BLA and RSC extends beyond our behavioral task and that these regions instead play a more general role in updating reward value and preference. Our data indicate the presence of a circuit involving the BLA and RSC to update reward value. In support of this, the BLA sends direct projections to the RSC (Buckwalter et al., 2008; Hintiryan et al., 2021). However, further studies are needed to determine whether the BLA and RSC are responsible for updating preference (during training sessions) and/or expressing preference changes (during the free-feeding test). Similar behavioral effects were observed across male and female rats, although the lesion experiments were only performed in male rats. Future work will be needed to verify the role of the BLA and RSC in changing reward preference in female rats. Collectively, our data highlight a previously unappreciated role for the BLA and RSC in mediating enhanced preference for a less desirable reward that follows a long temporal delay.
Footnotes
This work was supported by National Institutes of Health Grants DA033386 and DA042362 to M.J.W.; and the Mind Science Foundation Research Award to M.J.L.
The authors declare no competing financial interests.
Correspondence should be addressed to Matthew J. Wanat at Matthew.wanat@utsa.edu