Research Articles, Behavioral/Cognitive

Dopamine and Norepinephrine Differentially Mediate the Exploration–Exploitation Tradeoff

Cathy S. Chen, Dana Mueller, Evan Knep, R. Becket Ebitz and Nicola M. Grissom
Journal of Neuroscience 30 October 2024, 44 (44) e1194232024; https://doi.org/10.1523/JNEUROSCI.1194-23.2024
Cathy S. Chen, Dana Mueller, Evan Knep, and Nicola M. Grissom: Department of Psychology, University of Minnesota, Minneapolis, Minnesota 55455
R. Becket Ebitz: Department of Neurosciences, Université de Montréal, Montréal, Quebec H3T 1J4, Canada

Figures

  • Figure 1.

    Modulation of dopamine receptor activity affected reward-directed behaviors. Upregulating dopamine activity increased stickiness in choice behaviors regardless of outcomes. Changes in response time and performance across sessions and within sessions over time are shown in Extended Data Figures 1-1 and 1-2. A, Schematic of the mouse touchscreen chamber and the trial structure of the two-armed spatial restless bandit task. B, An example of the dynamic reward contingency showing the changing reward probabilities associated with each option in a single session. C, The dopaminergic drug administration schedule and drug dosage used to modulate dopamine receptor activity. Left, A nonselective dopamine receptor agonist apomorphine (0.1 mg/kg) and, right, a nonselective dopamine receptor antagonist flupenthixol (0.03 mg/kg) were systemically administered to respectively up- and downregulate dopamine receptor activity. Saline (0.9%) was used as the vehicle control. Control and drug sessions were interleaved and repeated for six sessions each. D, Left, Average probability of obtaining reward compared with the chance level probability of reward for each animal under apomorphine (APO) and vehicle (dot). Error bars depict mean ± SEM across sessions for each animal. Right, Probability of obtaining reward over chance level across APO (dark green) and vehicle (gray). E, Average response time under APO and vehicle. F, Left, Average probability of obtaining reward compared with the chance level probability of reward for each animal under flupenthixol (FLU) and vehicle (dot). Error bars depict mean ± SEM across sessions for each animal. Right, Probability of obtaining reward over chance level across FLU (light green) and vehicle (gray). G, Average response time under FLU and vehicle. H, Average probability of win-stay on vehicle control or APO. I, Average probability of lose-shift on vehicle control or APO. J, Average outcome sensitivity across sessions under APO and vehicle. K, Average probability of win-stay on vehicle control or FLU. L, Average probability of lose-shift on vehicle control or FLU. M, Average outcome sensitivity across sessions under FLU and vehicle. * indicates p < 0.05. Graphs depict mean ± SEM across animals unless specified otherwise.
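    The win-stay and lose-shift measures in panels H, I, K, and L can be illustrated with a short sketch. It assumes the conventional definitions (win-stay: repeating a choice after a rewarded trial; lose-shift: switching after an unrewarded trial); the caption does not spell out the exact computation, and the function name, data layout, and example session below are hypothetical.

```python
def win_stay_lose_shift(choices, rewards):
    """Return (P(win-stay), P(lose-shift)) for one session.

    choices: option chosen on each trial (e.g. 0 = left, 1 = right)
    rewards: 0/1 outcome on each trial, same length as choices
    Assumes the conventional definitions; not taken from the source.
    """
    wins = stays_after_win = 0
    losses = shifts_after_loss = 0
    # Pair each trial with the trial that follows it.
    for t in range(len(choices) - 1):
        repeated = choices[t + 1] == choices[t]
        if rewards[t]:
            wins += 1
            stays_after_win += repeated
        else:
            losses += 1
            shifts_after_loss += not repeated
    win_stay = stays_after_win / wins if wins else float("nan")
    lose_shift = shifts_after_loss / losses if losses else float("nan")
    return win_stay, lose_shift


# Hypothetical six-trial session, for illustration only.
example_choices = [0, 0, 1, 1, 0, 0]
example_rewards = [1, 0, 1, 1, 0, 0]
ws, ls = win_stay_lose_shift(example_choices, example_rewards)
```

    The last trial has no successor, so it contributes an outcome to neither measure.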

  • Figure 2.

    Modulation of beta-noradrenergic receptor activity bidirectionally changed sensitivity to outcome. The effect of beta-noradrenergic modulation on win-stay and lose-shift was also influenced by sex. A, The noradrenergic drug administration schedule and drug dosage used to modulate noradrenergic receptor activity. Left, A beta-noradrenergic receptor agonist isoproterenol (0.3 mg/kg) and, right, a beta-noradrenergic receptor antagonist propranolol (5 mg/kg) were systemically administered to respectively up- and downregulate norepinephrine activity. Saline (0.9%) was used as the vehicle control. Control and drug sessions were interleaved and repeated for six sessions each. B, Left, Average probability of obtaining reward compared with the chance level probability of reward for each animal under isoproterenol (ISP) and vehicle (dot). Error bars depict mean ± SEM across sessions for each animal. Right, Probability of obtaining reward over chance level across ISP (dark brown) and vehicle (gray). C, Average response time under ISP and vehicle. ISP significantly increased response time. D, Left, Average probability of obtaining reward compared with the chance level probability of reward for each animal under propranolol (PRO) and vehicle (dot). Error bars depict mean ± SEM across sessions for each animal. Right, Probability of obtaining reward over chance level across PRO (light brown) and vehicle (gray). E, Average response time under PRO and vehicle. PRO significantly decreased response time. F, Average probability of win-stay under ISP and vehicle. G, Average probability of lose-shift under ISP and vehicle. Inset, Probability of lose-shift across treatments across sexes. Decrease in lose-shift under ISP was primarily driven by changes in females (interaction term). H, Average probability of win-stay under PRO and vehicle. Inset, Probability of win-stay across treatments across sexes. Increase in win-stay under PRO was primarily driven by changes in males (interaction term). I, Average probability of lose-shift under PRO and vehicle. Inset, Probability of lose-shift across treatments across sexes. Decrease in lose-shift under PRO was primarily driven by changes in males (interaction term). J, Average outcome sensitivity across sessions under ISP and vehicle. ISP decreased outcome sensitivity. K, Outcome sensitivity across sexes under ISP and vehicle. L, Average outcome sensitivity across sessions under PRO and vehicle. PRO increased outcome sensitivity. M, Outcome sensitivity across sexes under PRO and vehicle. * indicates p < 0.05. Graphs depict mean ± SEM across animals unless specified otherwise.

  • Figure 3.

    Up- and downregulation of dopamine activity had bidirectional effects on the level of exploration. The neuromodulatory effect on exploration across sexes is shown in Extended Data Figure 3-1. A, Left, Structure of an HMM that modeled exploration and exploitation as latent goal states underlying observed choices. This model incorporates two exploit states for two choices and one explore state where choices were uniformly distributed across options. Right, Reward probabilities, choices, and HMM labels for an example session of 300 trials for a given mouse. Shaded areas mark HMM-labeled exploratory choices. B, Probability of choice as a function of value differences between choices for exploratory and exploitative states under vehicle (left) and flupenthixol (FLU; right). C, D, Difference in response time between explore and exploit choices under vehicle and flupenthixol (FLU). E, Probability of choice as a function of value differences between choices for exploratory and exploitative states under vehicle (left) and apomorphine (APO; right). F, G, Difference in response time between explore and exploit choices under vehicle and apomorphine (APO). H, Distribution of the percentage of HMM-labeled exploratory choices under flupenthixol (FLU)/vehicle (top) and apomorphine (APO)/vehicle (bottom). I, Probability of exploration for dopamine antagonist (FLU) and agonist (APO), normalized by their vehicle control. Decreasing dopamine activity increased exploration and increasing dopamine activity decreased exploration. J, Probability of exploration by session with vehicle and drug sessions interleaved (flupenthixol, top; apomorphine, bottom). Drug administration sessions are in colored shades. * indicates p < 0.05. Graphs depict mean ± SEM across animals.
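    The HMM structure in panel A (two exploit states that each emit one option, plus one explore state with uniform emissions) can be sketched as follows. Only the emission structure comes from the caption. The transition and starting probabilities are made-up placeholder values, and Viterbi decoding is used here as one common way to label trials; the source does not say how the model was fit or how trials were labeled.

```python
import math

STATES = ("exploit_left", "exploit_right", "explore")

# P(choice | state), choices indexed 0 = left, 1 = right (from the caption):
# each exploit state emits only its own option, explore is uniform.
EMIT = {
    "exploit_left": (1.0, 0.0),
    "exploit_right": (0.0, 1.0),
    "explore": (0.5, 0.5),
}

# Hypothetical transition and start probabilities (NOT from the source).
TRANS = {
    "exploit_left": {"exploit_left": 0.9, "exploit_right": 0.0, "explore": 0.1},
    "exploit_right": {"exploit_left": 0.0, "exploit_right": 0.9, "explore": 0.1},
    "explore": {"exploit_left": 0.2, "exploit_right": 0.2, "explore": 0.6},
}
START = {"exploit_left": 0.0, "exploit_right": 0.0, "explore": 1.0}


def safe_log(p):
    """Log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(choices):
    """Most likely latent state for each trial, given the choices."""
    score = {s: safe_log(START[s]) + safe_log(EMIT[s][choices[0]]) for s in STATES}
    back = []
    for c in choices[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            # Best predecessor of state s on this trial.
            prev = max(STATES, key=lambda p: score[p] + safe_log(TRANS[p][s]))
            new_score[s] = score[prev] + safe_log(TRANS[prev][s]) + safe_log(EMIT[s][c])
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical choice sequence, for illustration only.
example_choices = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
labels = viterbi(example_choices)
explore_fraction = labels.count("explore") / len(labels)
```

    In this sketch, `explore_fraction` plays the role of the percentage of HMM-labeled exploratory choices summarized in panel H.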

  • Figure 4.

    Downregulating norepinephrine activity influenced exploration, but the effect was modulated by sex. A, Probability of choice as a function of value differences between choices for exploratory and exploitative states under vehicle (left) and propranolol (PRO; right). B, C, Difference in response time between explore and exploit choices under vehicle and propranolol (PRO). D, Probability of choice as a function of value differences between choices for exploratory and exploitative states under vehicle (left) and isoproterenol (ISP; right). E, F, Difference in response time between explore and exploit choices under vehicle and isoproterenol (ISP). G, Distribution of the percentage of HMM-labeled exploratory choices under propranolol (PRO)/vehicle (top) and under isoproterenol (ISP)/vehicle (bottom). H, Probability of exploration for beta-noradrenergic antagonist (PRO) and agonist (ISP), normalized by their vehicle control. Propranolol decreased the level of exploration. I, Probability of exploration by session with vehicle and drug sessions interleaved. Drug administration sessions are in colored shades. Top, Propranolol (PRO) condition; bottom, isoproterenol (ISP) condition. J, Probability of exploration under PRO and vehicle across sexes. The effect of PRO on exploration was primarily driven by a significant decrease in exploration under PRO in males. K, Probability of exploration under ISP and vehicle across sexes. ISP significantly decreased exploration in females but not males. * indicates p < 0.05. Graphs depict mean ± SEM across animals.

  • Figure 5.

    Up- and downregulating dopamine receptor activity bidirectionally modulated the decision noise parameter in a reinforcement learning (RL) model. Correlations between the HMM-inferred probability of exploration and RL model parameters are shown in Extended Data Figure 5-1. Choice kernel updating rates from the RL model were not affected by modulation and are shown in Extended Data Figure 5-2. A, A diagram of RL parameters that capture learning rate (α), asymmetric learning (γ), choice bias (αc), and inverse temperature/decision noise (β). The RL models tested included one or more of the above parameters. B, Model likelihood for six RL models using log likelihood for vehicle and apomorphine (APO) conditions (top) and vehicle and flupenthixol (FLU) conditions (bottom). The three-parameter RLCK model was the best fit and most parsimonious model for behaviors. C, Model agreement for the best fit model (RLCK), which measures the probability of the model predicting the actual choices of animals for vehicle and APO conditions (top) and vehicle and FLU conditions (bottom). D, E, Animal choices (purple) and simulation (green) from the best fit model (RLCK) for the same animal under vehicle/APO and vehicle/FLU. The gray line represents the reward probability of the left choice. F, Average inverse temperature (β) across APO and vehicle. Inset, Inverse temperature across conditions after removing two outliers. G, Average inverse temperature (β) across FLU and vehicle. Inset, Inverse temperature across conditions after removing two outliers. H, Agonizing (APO) and antagonizing (FLU) dopamine activity revealed a bidirectional effect on inverse temperature. I, Average learning rate (α) across FLU and vehicle. Flupenthixol increased learning rate. J, Average learning rate (α) across APO and vehicle. * indicates p < 0.05. Graphs depict mean ± SEM across animals.
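    To make the roles of α, αc, and β concrete, here is a minimal sketch of one plausible form of a three-parameter RL model with a choice kernel. The caption names the parameters but gives no equations, so everything else is an assumption: a delta-rule value update, a choice kernel that drifts toward the chosen option, a single inverse temperature applied to the sum of value and choice kernel, and the initial values. Function names are hypothetical.

```python
import math


def rlck_choice_probs(choices, rewards, alpha, alpha_c, beta, n_options=2):
    """Model probability of each observed choice, trial by trial.

    alpha: learning rate, alpha_c: choice kernel updating rate,
    beta: inverse temperature (lower beta = noisier decisions).
    The update rules are assumed, not taken from the source.
    """
    q = [0.5] * n_options   # assumed initial values
    ck = [0.0] * n_options  # assumed initial choice kernel
    probs = []
    for choice, reward in zip(choices, rewards):
        # Softmax over value plus choice kernel, scaled by beta.
        logits = [beta * (q[k] + ck[k]) for k in range(n_options)]
        top = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(x - top) for x in logits]
        probs.append(weights[choice] / sum(weights))
        # Delta-rule update of the chosen option's value.
        q[choice] += alpha * (reward - q[choice])
        # Choice kernel moves toward 1 for the chosen option, 0 otherwise.
        for k in range(n_options):
            target = 1.0 if k == choice else 0.0
            ck[k] += alpha_c * (target - ck[k])
    return probs


def log_likelihood(choices, rewards, alpha, alpha_c, beta):
    """Sum of log choice probabilities under the sketch model."""
    probs = rlck_choice_probs(choices, rewards, alpha, alpha_c, beta)
    return sum(math.log(p) for p in probs)


# Hypothetical data and parameter values, for illustration only.
example_choices = [0, 0, 1, 0, 0, 1]
example_rewards = [1, 1, 0, 1, 0, 1]
ll = log_likelihood(example_choices, example_rewards, alpha=0.4, alpha_c=0.2, beta=3.0)
```

    With β = 0 every choice has probability 0.5 regardless of history, which is the sense in which a lower inverse temperature corresponds to higher decision noise.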

  • Figure 6.

    Modulation of beta-noradrenergic receptor activity led to changes in the learning rate parameter in a reinforcement learning (RL) model. A, Model likelihood for six RL models using log likelihood for vehicle and isoproterenol (ISP) condition (top) and vehicle and propranolol (PRO) condition (bottom). B, Model agreement for the best fit model (RLCK), which measures the probability of the model predicting the actual choices of animals for vehicle and isoproterenol (ISP) condition (top) and vehicle and propranolol (PRO) condition (bottom). C, Average learning rate (α) across ISP and vehicle. D, Average learning rate (α) across PRO and vehicle. E, PRO and ISP had different effects on learning rate, normalized by the vehicle control for each drug condition. F, Average inverse temperature (β) across PRO and vehicle. G, Average inverse temperature (β) across sexes under vehicle and PRO condition. H, Average inverse temperature (β) across ISP and vehicle. I, Average inverse temperature (β) across sexes under vehicle and ISP condition. Females had significantly higher inverse temperature (lower decision noise). J, K, Animal choices (purple) and simulation (green) from the best fit model (RLCK) for the same animal under vehicle/PRO and vehicle/ISP. The gray line represents the reward probability of the left choice. * indicates p < 0.05. Graphs depict mean ± SEM across animals.

Extended Data

  • Figure 1-1

    Average correct performance and response time across sessions. A) Average probability of choosing the best choice at any given trial (correct performance) across sessions under flupenthixol (FLU)/vehicle and apomorphine (APO)/vehicle. B) Average probability of choosing the best choice at any given trial (correct performance) across sessions under propranolol (PRO)/vehicle and isoproterenol (ISP)/vehicle. C) Average response time across sessions under flupenthixol (FLU)/vehicle and apomorphine (APO)/vehicle. D) Average response time across sessions under propranolol (PRO)/vehicle and isoproterenol (ISP)/vehicle. Graphs depict mean ± SEM across animals.

  • Figure 1-2

    Changes in task performance, response time, and level of exploration within sessions across 30-minute time bins. A) Average probability of choosing the best choice at any given trial (highest payoff choice) within sessions across 30-minute time bins under flupenthixol (FLU)/vehicle and apomorphine (APO)/vehicle. B) Average probability of choosing the best choice at any given trial (highest payoff choice) within sessions across 30-minute time bins under propranolol (PRO)/vehicle and isoproterenol (ISP)/vehicle. C) Average response time within sessions across 30-minute time bins under flupenthixol (FLU)/vehicle and apomorphine (APO)/vehicle. D) Average response time within sessions across 30-minute time bins under propranolol (PRO)/vehicle and isoproterenol (ISP)/vehicle. E) Average probability of exploration inferred from the hidden Markov model (HMM) within sessions across 30-minute time bins under flupenthixol (FLU)/vehicle and apomorphine (APO)/vehicle. F) Average probability of exploration inferred from the hidden Markov model (HMM) within sessions across 30-minute time bins under propranolol (PRO)/vehicle and isoproterenol (ISP)/vehicle. Graphs depict mean ± SEM across animals.

  • Figure 3-1

    Effect of dopamine and norepinephrine modulation on exploration in female and male mice.

  • Figure 5-1

    Correlation between the reinforcement learning (RL) model parameters (learning rate and inverse temperature, i.e., decision noise) and the probability of exploration inferred from the hidden Markov model (HMM) under all treatment conditions. A higher level of exploration was correlated with lower inverse temperature, i.e., higher decision noise. A) Correlation between learning rate (α) and probability of exploration under dopamine (DA) manipulations (flupenthixol (FLU) on the left and apomorphine (APO) on the right). B) Correlation between learning rate (α) and probability of exploration under beta-noradrenergic (NE) manipulations (propranolol (PRO) on the left and isoproterenol (ISP) on the right). C) Correlation between inverse temperature (β) and probability of exploration under dopamine (DA) manipulations (flupenthixol (FLU) on the left and apomorphine (APO) on the right). D) Correlation between inverse temperature (β) and probability of exploration under beta-noradrenergic (NE) manipulations (propranolol (PRO) on the left and isoproterenol (ISP) on the right). Spearman’s rho is reported. * indicates p < 0.05, ** indicates p < 0.01, *** indicates p < 0.001.

  • Figure 5-2

    The choice kernel updating rate (αc) parameter in the reinforcement learning (RL) model was not affected by modulation of dopamine and noradrenergic receptor activity. A) Average choice kernel (αc) across flupenthixol (FLU) and vehicle. B) Average choice kernel (αc) across apomorphine (APO) and vehicle. C) Average choice kernel (αc) across propranolol (PRO) and vehicle. D) Average choice kernel (αc) across isoproterenol (ISP) and vehicle. Graphs depict mean ± SEM across animals.


Keywords

  • catecholamine
  • decision-making
  • dopamine
  • exploration–exploitation tradeoff
  • norepinephrine
  • reinforcement learning

Copyright © 2025 by the Society for Neuroscience.
JNeurosci Online ISSN: 1529-2401
