Abstract
Despite its prevalence in studying the causal roles of different brain circuits in cognitive processes, electrical microstimulation often results in inconsistent behavioral effects. These inconsistencies are assumed to be due to multiple mechanisms, including habituation, compensation by other brain circuits, and contralateral suppression. Considering the presence of reinforcement in most experimental paradigms, we hypothesized that interactions between reward feedback and microstimulation could contribute to inconsistencies in behavioral effects of microstimulation. To test this, we analyzed data from electrical microstimulation of the frontal eye field of male macaques during a value-based decision-making task and constructed network models to capture choice behavior. We found evidence for microstimulation-dependent adaptation in saccadic choice, such that in stimulated trials, monkeys’ choices were biased toward the target in the response field of the microstimulated site (Tin). In contrast, monkeys showed a bias away from Tin in nonstimulated trials following microstimulation. Critically, this bias slowly decreased as a function of the time since the last stimulation. Moreover, microstimulation-dependent adaptation was influenced by reward outcomes in preceding trials. Despite these local effects, we found no evidence for global effects of microstimulation on learning and sensitivity to the reward schedule. By simulating choice behavior across various network models, we found that a model in which microstimulation and reward-value signals interact competitively through reward-dependent plasticity can best account for our observations. Our findings indicate a reward-dependent compensatory mechanism that enhances robustness to perturbations within the oculomotor system and could explain the inconsistent outcomes observed in previous microstimulation studies.
Significance Statement
Electrical microstimulation has been used to study the causal contributions of certain brain areas or circuits to cognition and behavior. Nonetheless, the overall impact of microstimulation on behavior remains inconclusive, hinting at neural mechanisms that interact with experimental perturbation of neural activity. We hypothesized that this interaction could be driven by the reward feedback animals receive while performing tasks, either with or without external perturbations. Using computational modeling and data from microstimulation during a reward-dependent decision-making task, we found that microstimulation and reward-value signals competitively interact within the oculomotor system. This interaction enhances the system's robustness to both internal and external perturbations. Our results have important implications for employing microstimulation in basic and clinical research.
Introduction
Electrical microstimulation of the brain is one of the most widely used methods in systems neuroscience for uncovering the contributions of different brain circuits to cognition and behavior (Seidemann et al., 1998; Bradley et al., 2005; Hanks et al., 2006; Murphey and Maunsell, 2007; Schafer and Moore, 2007; Clark et al., 2011). In addition, microstimulation has been extensively used for cognitive therapy (Grover et al., 2023), to treat neurological diseases such as Parkinson's and epilepsy (Wu et al., 2021), and to develop prosthetic devices for visual (Schmidt et al., 1996; Torab et al., 2011; Davis et al., 2012), somatosensory (Berg et al., 2013; Tabot et al., 2013; Thomson et al., 2013; Hughes et al., 2021; Urdaneta et al., 2021; K. Ding et al., 2024), and motor systems (Penfield and Boldrey, 1937; Sironi, 2011; Zimnik and Churchland, 2021). Although these applications focus on different brain circuits, one of the most direct tests for the effectiveness of microstimulation is its causal effect on choice behavior.
However, the influence of microstimulation on behavior is often inconsistent and the associated evidence inconclusive: divergent behaviors emerge across different experimental settings and animal models, and effects can even change over extended periods within the same study (Salzman et al., 1992; Murasugi et al., 1993; Chamberlin and Saper, 1994; Hanks et al., 2006; Griffin et al., 2009; Plow et al., 2009; Watanabe and Munoz, 2010, 2011; Elsner et al., 2020). This inconsistency has been hypothesized to be due to many factors, including differences in characteristics and protocols of the microstimulation (Murasugi et al., 1993), technological and experimental design limitations (Doty, 1969; Jazayeri and Afraz, 2017), habituation, and compensation by the corresponding brain areas in the opposite hemisphere (Doty, 1969). Interestingly, these inconsistencies can also be viewed as robustness of behavior to microstimulation. Nevertheless, it is relatively easy to induce a behavioral shift locally (in time) when microstimulation is applied.
For example, previous studies involving electrical microstimulation of various structures within the oculomotor system, such as the superior colliculus (SC; Carello and Krauzlis, 2004), frontal eye field (FEF; Gold and Shadlen, 2000; Schafer and Moore, 2007; Murd et al., 2020), lateral intraparietal cortex (LIP; Hanks et al., 2006), supplementary eye field (SEF; Berdyyeva and Olson, 2014), and a combination of SC and FEF (Gardner and Lisberger, 2002), have demonstrated that on stimulated trials, microstimulation induces a local bias in target selection toward the response field (RF) of the stimulated neurons. Critically, most of the tasks used in these studies involve reward feedback to motivate appropriate behavioral responses. Therefore, the overall behavior might remain robust to microstimulation due to interactions between reward feedback and microstimulation. Interestingly, activity in most areas of the oculomotor system is modulated by the reward value of the upcoming saccade, including SC (Ikeda and Hikosaka, 2003), FEF (L. Ding and Hikosaka, 2006), LIP (Platt and Glimcher, 1999; Sugrue et al., 2004; Seo et al., 2009), and SEF (Donahue et al., 2013; Chen and Stuphorn, 2015). Therefore, the local behavioral bias caused by microstimulation of the above brain areas may interact with reward-value signals, and this interaction could occur within the same areas.
Similarly, although a meta-analysis of studies on cognitive therapy using electrical stimulation indicates that such interventions can enhance cognitive functions such as attention and memory, the effects on motor learning and decision-making remain inconclusive (Grover et al., 2023). Interestingly, both motor learning and decision-making tasks often involve reward feedback on a trial-by-trial basis. This suggests that inconsistency in the effectiveness of electrical stimulation for these tasks might be due to the interaction between stimulation and reward feedback.
To examine the interaction between reward feedback and electrical stimulation and its impact on learning and choice behavior, we analyzed data from microstimulation of the FEF during a dynamic value-based choice task in which monkeys chose between two alternative options that provided rewards with different probabilities. We examined the effects of microstimulation on choice behavior in trials in which microstimulation occurred, in the subsequent trials with no microstimulation, and on average across trials. Additionally, to capture our experimental findings and reveal possible neural mechanisms underlying the interaction between reward signals and microstimulation and its effect on target selection, we constructed multiple alternative network models of subcortical and cortical circuits within the oculomotor system to simulate choice behavior under different experimental conditions.
Materials and Methods
Experimental design and statistical analysis
Two male monkeys (Macaca mulatta) weighing 6 kg (Monkey 1) and 11 kg (Monkey 2) were used in the experiments. All surgical and behavioral procedures were approved by the Stanford University Administrative Panel on Laboratory Animal Care and the consultant veterinarian and were in accordance with the National Institutes of Health and Society for Neuroscience guidelines. More details on the experimental design, including stimulus presentation, data acquisition, analysis of eye movements, and delivery of electrical stimulation, have been reported previously (Schafer and Moore, 2007). Some nonoverlapping findings based on this dataset have been published previously (Schafer and Moore, 2007; Soltani et al., 2021).
Visual stimuli
Saccade targets were drifting sinusoidal gratings within stationary, 5–8° Gaussian apertures. Gratings had Michelson contrast between 2 and 8% and a spatial frequency of 0.5 cycles/°. Drift speed was 5°/s in a direction (up or down) perpendicular to the saccade required to acquire the target (left or right). These parameters were held constant during a session of the experiment. Targets were identical on each trial except for the drift direction (i.e., up vs down), which was selected randomly for each target.
The choice task with a dynamic reward schedule
Following fixation on a central fixation spot, there was a variable delay (200–600 ms) before the two targets appeared on the screen (Fig. 1A). Targets appeared simultaneously, equidistant from the fixation spot, and opposite to each other. Using a similar convention for the control and microstimulation conditions, we call these two targets
Dynamic value-based choice task and subthreshold microstimulation of the FEF. A, Task design. A fixation point appeared on the screen on each trial, followed by the presentation of two drifting-grating targets. Monkeys indicated their choice with a saccade to one of the two targets; however, the targets were removed as soon as the eyes left the fixation point. A juice reward was delivered on a variable schedule following each saccade.
Overall, two monkeys completed 32 microstimulation sessions (19 and 13 sessions for Monkeys 1 and 2, respectively) consisting of 7,693 trials (4,554 and 3,139 trials for Monkeys 1 and 2, respectively). They also completed 160 sessions of the dynamic choice task with no microstimulation (control condition; 74 and 86 sessions for Monkeys 1 and 2, respectively). Each session consisted of an average of 140 and 370 trials for Monkeys 1 and 2, respectively. The control and microstimulation conditions were performed in separate sessions.
Reward schedule
Following each correct saccadic choice, the monkey received a juice reward according to a dynamic schedule (Abe and Takeuchi, 1993). The probability of reward given a selection of
Based on the configuration of the reward schedule, the reward probabilities on choosing
Electrical microstimulation
Electrical stimulation of an FEF site was delivered through tungsten electrodes, while current amplitude was measured via the voltage drop across a 1 kΩ resistor in series with the return lead of the current source. First, the FEF was localized on the basis of its surrounding physiological and anatomical landmarks and the ability to evoke fixed-vector, saccadic eye movements with (suprathreshold) stimulation using currents below 50 µA at a frequency of 200 Hz (0.3 ms pulse duration, 100 ms trains). Threshold current was determined using a separate calibration task. During the stimulated trials of the choice task, subthreshold microstimulation was delivered at threshold current (±2 µA) at a lower frequency of 60 Hz. Thresholds for evoking saccades were measured again after the experimental sessions, and sessions with significantly different current thresholds were excluded from analyses.
The location of the endpoints of evoked saccades due to suprathreshold microstimulation was then expressed as the center of the FEF site's RF. To target sessions in which microstimulation was efficacious but did not simply drive the monkey's eyes toward one target, we analyzed sessions in which microstimulation changed the probability of
During sessions of saccadic choice trials that included microstimulation (microstimulation condition), target
Statistical analysis
Statistical tests used for comparing different conditions are reported in the text, along with the p value and effect size. Unless otherwise specified, the two-sided Wilcoxon sign test or Wilcoxon signed-rank test is used to compare paired data, whereas the two-sided Wilcoxon rank-sum test is used to compare independent data.
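As an illustration of the paired comparison named above, the following is a minimal sketch of an exact two-sided sign test on paired differences. The function name `sign_test_two_sided` and the example inputs are hypothetical and are not taken from the analysis code used in this study; zero differences are dropped, which is a common convention assumed here.

```python
from math import comb


def sign_test_two_sided(differences):
    """Exact two-sided sign test p value for paired differences.

    Zero differences are discarded (an assumption of this sketch).
    Under the null hypothesis, positive and negative differences
    are equally likely, so the count follows Binomial(n, 0.5).
    """
    pos = sum(1 for d in differences if d > 0)
    neg = sum(1 for d in differences if d < 0)
    n = pos + neg
    if n == 0:
        return 1.0  # no informative pairs
    k = min(pos, neg)
    # Probability of observing k or fewer of the rarer sign.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the one-sided tail and cap at 1.
    return min(1.0, 2.0 * tail)


# Hypothetical paired differences (e.g., condition A minus condition B).
p_value = sign_test_two_sided([1, 2, 3, 4, 5])
```

The signed-rank and rank-sum tests additionally use the ranks of the data and are typically run with a statistics package rather than implemented by hand.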
Reinforcement learning models used for fitting choice data
We used various reinforcement learning (RL) models to fit choice behavior in control and microstimulation conditions. The details of all the RL models have been reported previously (Soltani et al., 2021). We found that choice behavior in control and microstimulation sessions can best be explained using the same RL model. In this RL model, the locations of the
At the end of each trial, the subjective reward values of both chosen and unchosen targets were updated based on the reward outcome of that trial. More specifically, while subjective reward values of both chosen and unchosen targets were discounted in the following trial, the value of chosen target was further updated depending on the reward outcome. For example, if
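The update scheme described above (both values discounted, with the chosen value further adjusted by the outcome) can be sketched as follows. This is a minimal illustration under stated assumptions, not the fitted model from Soltani et al. (2021): the function name, the discount factor `gamma`, the increments `delta_rew` and `delta_unrew`, and the target labels are hypothetical.

```python
def update_values(values, chosen, rewarded,
                  gamma=0.9, delta_rew=0.3, delta_unrew=-0.1):
    """Return updated subjective values for both targets after one trial.

    values      : dict mapping target label -> current subjective value
    chosen      : label of the target chosen on this trial
    rewarded    : True if the choice was rewarded
    gamma       : discount applied to both targets (hypothetical value)
    delta_rew   : change to the chosen value after reward (hypothetical)
    delta_unrew : change to the chosen value after no reward (hypothetical)
    """
    # Both chosen and unchosen values are discounted.
    new_values = {target: gamma * v for target, v in values.items()}
    # Only the chosen target receives an outcome-dependent update.
    new_values[chosen] += delta_rew if rewarded else delta_unrew
    return new_values


# Hypothetical example: the first target is chosen and rewarded.
v = update_values({"Tin": 1.0, "Tout": 0.5}, "Tin", True,
                  gamma=0.5, delta_rew=0.25, delta_unrew=-0.25)
```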
Network models
Multiple brain areas involved in saccadic choice (including the FEF) also receive dopaminergic input, and the interaction between these areas determines adaptive target selection. To model this interaction in the context of FEF microstimulation, we considered a model that includes a second target-selection circuit in addition to the FEF circuit. We only included two target-selection circuits to avoid the complexity associated with the interaction between three or more circuits. Therefore, the network model consisted of a total of three circuits: the valuation circuit corresponding to a circuit estimating reward probability and two target-selection circuits corresponding to two areas in the oculomotor system.
The valuation circuit consisted of two pools of value-encoding neurons that were activated upon the presentation of visual targets and projected to the corresponding pools of neurons in the target-selection circuits. The reward value (or reward probability) associated with each target was encoded in two sets of plastic synapses onto the value-encoding pools. At the end of each trial, these neurons received signals indicating whether
The two target-selection circuits simulated two cortical areas involved in saccadic target selection, one of which was the FEF that was the site of microstimulation in our experiment. For simplicity of the modeling, we considered similar properties for the two target-selection circuits. Each target-selection circuit consisted of two pools of excitatory neurons selective for the two targets and one nonselective pool of inhibitory neurons. The selective pools received excitatory inputs from the corresponding value-encoding pools in the valuation circuit and inhibitory input from a shared inhibitory pool of neurons within each circuit. This architecture enabled each circuit to select between the two targets based on the reward value of the two targets (Soltani and Wang, 2006). Since the sensory inputs to value-encoding neurons were similar, the only factor differentiating the inputs to decision neurons was the strength of the plastic synapses from the sensory neurons onto the value-encoding neurons. As has been shown before, the selection behavior of each decision circuit can be fit as a sigmoid function of the difference in the inputs (Soltani et al., 2006; Soltani and Wang, 2006, 2010) as follows:
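A sigmoid selection rule of this general kind can be sketched as below. This is only an illustration of the functional form (a logistic function of the input difference); the function name and the width parameter `sigma` are hypothetical, and the parameter values used in the cited studies are not reproduced here.

```python
import math


def choice_probability(input_a, input_b, sigma=1.0):
    """Probability of selecting target A given the inputs to the two pools.

    The probability is a logistic (sigmoid) function of the difference
    in inputs; sigma (hypothetical) sets how steeply choice depends on
    that difference.
    """
    return 1.0 / (1.0 + math.exp(-(input_a - input_b) / sigma))


# Equal inputs give indifference between the two targets.
p_equal = choice_probability(1.0, 1.0)
# A larger input to pool A makes target A more likely to be selected.
p_biased = choice_probability(2.0, 1.0)
```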
To simulate the microstimulation experiment, we assumed that the pool of decision neurons selective for
Learning reward probabilities in the valuation circuit
The input currents from the value-encoding neurons depend on factors including the synaptic strength, total number of plastic synapses in each neuron, presynaptic firing rate, peak conductance of the potentiated and depressed states, and the time constant of input synaptic current (Soltani et al., 2006; Soltani and Wang, 2006). However, except for the synaptic strength, all other factors are similar for the two value-encoding pools and could be factored into a single variable k (in mA). Therefore, the selection behavior (Eq. 5) can be expressed as a function of the difference in the synaptic strengths of the neurons encoding the reward probability for
At the end of a given trial, according to the trial's choice and reward outcome, the synaptic strengths for
Alternative mechanisms for the effects of microstimulation
Here, we constructed three general models for the effects of microstimulation on target selection. This exploration was done for two main reasons. First, the effects of microstimulation on neural activity are complex and depend on the sensitivities of different neural elements (i.e., dendrites, axons, initial segments) as well as the properties of single neurons and the connectivity of neurons within the stimulated network (Tehovnik et al., 2006; Histed et al., 2009). Second, our goal was not to capture the effect of microstimulation on neural activity but to find general mechanisms that result in the observed patterns of adaptation of microstimulation effects. Therefore, to simulate our experimental observations and identify the most plausible mechanism, we started with the least complex model (i.e., the model with no adaptation to microstimulation). Based on the simulation results, we progressively increased the model's complexity (see details below).
The model with no adaptation to microstimulation
In this most basic model, the effect of the microstimulation on decision-making was simulated as a constant extracellular input current,
The second target-selection circuit received the same input currents from the value-encoding neurons as the FEF target-selection circuit. However, to simulate cross-area compensatory mechanisms in the oculomotor system, we assumed that microstimulation in the FEF accompanied by an increase and a decrease in input to the neurons selective for
Based on this compensatory mechanism, we set the probability of
Finally, the outcome choice probabilities of the two target-selection circuits are combined to determine the final choice probability for selection between the two targets,
Model with desensitization
In this model, microstimulation affected the
Specifically, after each microstimulation, the depression variable
The model with reward-dependent adaptation
This model was built upon the previous model (i.e., the model with desensitization) with one difference: in this model, the efficacy of the microstimulation current,
More specifically, we assumed this efficacy consisted of two components: a fixed component that depended on network dynamics and a variable component capturing the combination of network and single-cell adjustments to microstimulation as follows:
To maintain the same level of activity, the efficacy of
Moreover, the adaptive part of the microstimulation current efficacy in the FEF circuit was updated as follows:
Models without oculomotor system interaction
To determine whether our experimental data could be replicated by models with a single target-selection circuit, we simulated three such models, even though these models do not align with existing knowledge of the oculomotor system. These network models consisted of two circuits: the valuation circuit, which corresponded to a circuit estimating reward probability, and one target-selection circuit, which corresponded to the FEF. The valuation circuit and the target-selection circuit were similar to the corresponding circuits in the network models with cross-area compensation (see above). As with the models with cross-area compensation, the three models differed in their adaptation to microstimulation: a base model with no adaptation to microstimulation, a model with desensitization, and a model with reward-dependent adaptation.
Estimation of confidence
To estimate monkeys’ confidence on a given trial, we employed two methods. In the first method, we used the reaction time of each trial as a proxy for confidence, a method commonly used in the literature (De Martino et al., 2013; Stolyarova et al., 2019). The trials with faster reaction times were associated with higher confidence, whereas those with slower reaction times indicated lower confidence. Therefore, we used the median reaction time from each session to classify trials into high-confidence (HC) and low-confidence (LC) categories depending on whether a given trial had a reaction time faster or slower than the median reaction time, respectively.
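The median split on reaction time can be sketched as follows. This is a minimal illustration; the function name and the example reaction times are hypothetical, and assigning trials that fall exactly on the median to the LC category is an assumption of this sketch.

```python
from statistics import median


def split_by_confidence(reaction_times):
    """Label each trial in a session as 'HC' or 'LC' by a median split.

    Trials with reaction times faster than the session median are
    labeled high confidence (HC); all others are labeled low
    confidence (LC). Ties at the median go to LC (an assumption).
    """
    session_median = median(reaction_times)
    return ["HC" if rt < session_median else "LC" for rt in reaction_times]


# Hypothetical reaction times (ms) from one session.
labels = split_by_confidence([200, 300, 400, 500])
```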
In the second method, we defined HC and LC trials based on the difference between the values of chosen and unchosen targets, as estimated by the best-fitting RL model. Specifically, we computed the difference between the values of the two targets (chosen − unchosen) in each trial and used the median of this difference to divide trials into HC and LC based on whether the difference was larger or smaller than the median, respectively. The rationale behind this categorization was that a larger difference between the values of the two targets increases the likelihood that the higher-value target would be chosen, corresponding to easier decision-making and, thus, higher confidence. We performed this analysis using estimated values from the RL model rather than actual reward values because these estimates reflect subjective reward expectations, making them more relevant to confidence assessments. Because both methods yielded qualitatively similar results and reaction time is more commonly used to estimate confidence, we present only the results based on reaction time.
Results
Local effects of microstimulation on saccadic choice
To examine whether microstimulation of the oculomotor system and reward feedback interact, we analyzed data from a dynamic value-based decision-making task in which monkeys were trained to select one of two targets that appeared on the screen simultaneously. The monkeys freely selected between the two targets using saccadic eye movements under control and microstimulation conditions (Fig. 1A). In the control sessions, following the selection of a target, the monkey received a juice reward according to a dynamic schedule that was a function of the local fraction of choosing that target in the past 20 trials. In each session, one target was globally more rewarding, but the probability of reward on that target decreased every time the monkey chose it (Eqs. 1–2; Fig. 1C–E). Therefore, to successfully perform this task, the monkeys had to consider choice and reward history to estimate the reward probability associated with the two targets.
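The local choice fraction on which the reward schedule depends can be sketched as follows. This is a minimal illustration only: the function name and target labels are hypothetical, and the mapping from this fraction to reward probability (Eqs. 1–2) is not modeled here.

```python
def local_choice_fraction(choices, target, window=20):
    """Fraction of the most recent `window` trials on which `target` was chosen.

    choices : list of target labels chosen so far, oldest first
    target  : label of the target of interest
    window  : number of past trials to consider (20 in this task)

    Returns None when there is no choice history yet.
    """
    recent = choices[-window:]  # uses all trials if fewer than `window`
    if not recent:
        return None
    return sum(1 for c in recent if c == target) / len(recent)


# Hypothetical history: 10 choices of A followed by 30 choices of B.
history = ["A"] * 10 + ["B"] * 30
fraction_a = local_choice_fraction(history, "A")
fraction_b = local_choice_fraction(history, "B")
```

Because only the last 20 trials enter the fraction, a target that has been chosen heavily in the recent past carries a high local fraction even if it was rarely chosen earlier in the session.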
The microstimulation sessions were identical to the control ones except that in half of the randomly selected trials, subthreshold microstimulation was delivered at the site of the FEF, beginning when the targets were presented and lasting for 200 ms. Accordingly, one of the two targets
To test the local effects of microstimulation on target selection, we divided trials in the microstimulation condition into those with and without microstimulation (MS+ and MS− trials, respectively). We found that microstimulation biased target selection on a trial-by-trial basis. Specifically, we found that monkeys chose
Overall target selection and changes in saccadic choice in response to the FEF microstimulation. The percentage of
Despite increased
To further examine the local effects of microstimulation on target selection, we evaluated the effect of microstimulation in nonstimulated trials following a microstimulated trial. To that end, we calculated the change in the probability of
Evidence for compensatory mechanisms for microstimulation effects and interactions between microstimulation and reward feedback. A, Change in
To examine whether this microstimulation-dependent adaptation varied as a function of the number of consecutive microstimulations, we computed the change in
We hypothesized that this microstimulation-dependent adaptation and the lack of overall bias toward
We found that the difference between the probability of win–stay and lose–switch was more negative for both monkeys in the microstimulation compared with the control condition (Wilcoxon rank-sum test; both monkeys,
Global effects of microstimulation on saccadic choice
To further test the hypothesis that the interaction of reward feedback and microstimulation could offset the local effects of microstimulation on saccadic choice, resulting in no overall effect, we assessed the global effects of microstimulation on overall task performance. First, we calculated the overall harvested reward as a function of the global reward parameter (r). Despite individual variations, we found no significant differences between the control and microstimulation conditions (Figs. 4A,D, 5). We also tested whether within the microstimulation sessions, microstimulation of the FEF changed the overall response of the monkeys to the reward schedule. To that end, we examined the slope of
Overall target selection and changes in saccadic choice in response to the FEF microstimulation, separately for the two monkeys. A, Harvested reward per trial as a function of the global reward parameter for zero penalty sessions for Monkey 1. Dark (light) blue circles correspond to the control (microstimulation) sessions. The solid-colored lines show fit using a quadratic function. B, The percentage of
Comparison of the overall harvest rate between control and microstimulation conditions. Plotted are estimated values and confidence intervals for the overall harvested reward as a function of the global reward parameter using a quadratic function (harvested reward
We then compared overall target selection between the control and microstimulation conditions to examine the relationship between choice and reward fractions (i.e., matching behavior). Overall, we found that both monkeys followed the reward schedule similarly in the two conditions as reflected in the slope of choice versus reward fractions [slope (
m) of choice vs reward fractions and its confidence interval: Monkey1,
By comparing choice and reward fractions across sessions, we found that monkeys exhibited strong undermatching in both control and microstimulation conditions (Fig. 4C,F). That is, the relative selection of the more rewarding target was smaller than the relative reinforcement obtained on that target [Monkey 1 control, median(choice fraction − reward fraction) =
Effects of microstimulation on learning
To examine whether microstimulation influenced learning, we fit the choice behavior in control and microstimulation conditions using different RL models. We found that a similar RL model best explained the choice behavior of both monkeys in the control and microstimulation sessions (see Materials and Methods for details). Interestingly, we did not find a significant difference between the discount factors of the RL model in the control and microstimulation conditions for either monkey [Monkey 1, difference (stim–control) in discount factor
Together, the above analyses of monkeys’ choice behavior demonstrate that although microstimulation induced an overall bias in target selection on MS+ trials, there was no evidence that it impacted the monkeys’ ability to learn and perform the task effectively. These findings suggest that despite the local effects of FEF microstimulation on target selection, the oculomotor system and related circuits were able to compensate for these changes using reward feedback, thereby maintaining robust overall saccadic choice behavior in the face of perturbation.
Neural mechanisms underlying adjustments of saccadic choice to microstimulation
To account for our observations, we constructed several computational network models to explore alternative plausible mechanisms. The basic model consists of three circuits: two target-selection circuits and a valuation circuit. The reward values of the two targets are encoded in two separate sets of plastic synapses in the valuation circuit. These synapses are updated at the end of each trial depending on the choice of the network and reward feedback. The two target-selection circuits represent two brain areas involved in saccadic choice: the FEF, which was the site of microstimulation in our experiment, and another area in the oculomotor system (e.g., LIP or SEF). Each target-selection circuit consists of two excitatory neuronal populations that are selective to the two targets and one nonselective inhibitory neuronal population. The excitatory pools of neurons in each target-selection circuit receive excitatory inputs from the corresponding pools of neurons in the valuation circuit and use these inputs to make decisions (see Materials and Methods for more details).
This configuration with two target-selection circuits allowed us to simulate the interaction between the FEF and another oculomotor area. We simulated this interaction based on previous findings that microstimulation of FEF can cause both excitation and inhibition in the contralateral FEF (Schlag-Rey et al., 1992; Schlag et al., 1998). Specifically, the contralateral neurons that became excited displayed response selectivity similar to that of the stimulated FEF neurons. In contrast, the neurons that were inhibited showed response selectivity that was opposite to that of the stimulated FEF neurons (see Materials and Methods for details).
To reveal the underlying mechanism of the observed effects of microstimulation on choice behavior, we implemented different mechanisms by which microstimulation could impact the behavior. To that end, we progressively increased the complexity of our models, starting from a base model with no adaptation to microstimulation, then to a model with desensitization to microstimulation, and, finally, to a model with reward-dependent adaptation to microstimulation (see Materials and Methods for more details).
A distinct effect of microstimulation at an oculomotor site is a biasing of saccades toward the object placed in the receptive field of the stimulated site (Gold and Shadlen, 2000; Carello and Krauzlis, 2004; Schafer and Moore, 2007; Murd et al., 2020). Therefore, we tested whether a model in which microstimulation increased the input to the target in the receptive field of the stimulated site
The schematic and choice behavior of the model with no adaptation and the model with desensitization. A, Both models consisted of three circuits: the valuation circuit and two target-selection circuits corresponding to the FEF and another area in the oculomotor system. The valuation circuit consisted of two pools of value-encoding neurons that were activated upon the presentation of visual targets and projected to the corresponding pools of neurons in the two target-selection circuits. The reward probability associated with two alternative actions was encoded in two sets of plastic synapses onto the value-encoding pools. On microstimulation trials (MS+), the pool of neurons selective for
We found that this model with no adaptation showed a decrease in
The first model with adaptation to microstimulation was based on the evidence showing that neurons in the central nervous system can exhibit “fatigue” (Ye et al., 2012) or an increased refractory period as a response to stimulation (Feng et al., 2014). We simulated these effects by including depression or desensitization in the inputs to Tin neurons. More specifically, in this model, referred to as the model with desensitization, the microstimulation caused a decrease in sensitivity of the FEF
We found that the model with desensitization could account for the adaptation of the bias away from the
The shortcomings of the above two models suggest that the observed adaptation of the selection bias on MS− trials may require a reward-dependent adaptation of the target-selection network to microstimulation and, thus, an interaction between microstimulation and reward feedback. Therefore, we next asked whether an adaptation of microstimulation effects that depends on the presence or absence of reward could account for our observations. The simulation of this effect was inspired by findings that homeostatic activity regulation, activated by stimulation, can reduce hyperexcitability in brain networks (Chai et al., 2019). However, because mechanisms underlying such homeostatic plasticity are not fully understood, we implemented this effect by creating opposing adaptations in the efficacy of microstimulation and in the strength of input to
The schematic and choice behavior of the dynamic model with reward-dependent adaptation with opposing adjustments for microstimulation efficacy and endogenous reward inputs to the FEF target-selection circuit. A, The model consisted of three circuits: one valuation circuit and two target-selection circuits corresponding to the FEF and another area in the oculomotor system. The valuation circuit consisted of two pools of value-encoding neurons that were activated upon the presentation of visual targets and projected to the corresponding pools of neurons in the two target-selection circuits. The reward probability associated with two alternative actions was encoded in two sets of plastic synapses onto the value-encoding pools. On microstimulation trials (MS+), the pool of neurons selective for
Importantly, such opposing adaptations predict that the model should respond to reward and no reward differently after trials that
Both predictions (or postdictions) were confirmed by our experimental data (Fig. 3C,D). First, both monkeys utilized the win–stay strategy more often on
Because microstimulation increased
An alternative explanation for the observed interaction between microstimulation and response to reward feedback could be the influence of subjective confidence. Confidence in a choice can affect monkeys’ win–stay and lose–switch strategies, and this effect could be different depending on the choice (
After categorizing trials into HC or LC, we tested whether win–stay and lose–switch were different following these types of trials and across the two experimental conditions (i.e., control vs stimulation). We found that in the control condition, the monkeys used win–stay strategy more often on HC trials, suggesting that the monkeys stayed more on the rewarded choice target when they were more confident about their choices (Fig. 8A). Consistently, the monkeys were less likely to switch from the nonrewarded choice target after HC trials (Fig. 8B). As expected, these effects were similar for
Dependence of win–stay and lose–switch strategies on confidence during control sessions. HC and LC trials are defined using median reaction time. A, Plot shows the probability of win–stay separately for HC (circle) and LC (square) trials and for the
As mentioned above (Fig. 3C,D), the win–stay and lose–switch during the stimulation condition were influenced by both stimulation and the choice such that the monkeys stayed less on
Difference in win–stay and lose–switch strategies between stimulated and nonstimulated trials was not affected by confidence level. A, Difference in win–stay for HC trials in which microstimulation occurred (MS+) and trials in which no microstimulation occurred (MS−). B, Same as A, but for LC trials. C, D, Similar to panels A and B, but for lose–switch.
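The win–stay and lose–switch measures used in the confidence analysis can be illustrated with a minimal sketch. It is not the analysis code used in this study: the function name, the 0/1 coding of choices and rewards, and the rule that trials with a reaction time below the median count as HC are assumptions made for illustration only.

```python
import numpy as np

def strategy_by_confidence(choice, reward, rt):
    """Win-stay and lose-switch probabilities following HC and LC trials.

    choice: 0/1 chosen target on each trial (hypothetical coding)
    reward: 0/1 reward outcome on each trial
    rt:     reaction time on each trial
    Assumption: trials with rt below the median are labeled HC, the rest LC.
    """
    choice = np.asarray(choice)
    reward = np.asarray(reward).astype(bool)
    rt = np.asarray(rt, dtype=float)

    # Pair each trial t with the following trial t+1.
    stay = choice[1:] == choice[:-1]
    prev_reward = reward[:-1]
    high_conf = rt[:-1] < np.median(rt)

    result = {}
    for label, mask in (("HC", high_conf), ("LC", ~high_conf)):
        win = mask & prev_reward
        lose = mask & ~prev_reward
        result[label] = {
            # Fraction of rewarded trials followed by the same choice.
            "win_stay": stay[win].mean() if win.any() else np.nan,
            # Fraction of unrewarded trials followed by a different choice.
            "lose_switch": (~stay[lose]).mean() if lose.any() else np.nan,
        }
    return result
```

Applying the same function separately to MS+ and MS− trials, and to each choice target, would give the kind of comparison summarized in the figures, under the assumptions stated above.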
Finally, we asked whether the cross-area compensation was essential for replicating the experimental results. To that end, we created three models with a single target-selection circuit, even though such models are not consistent with existing knowledge of the oculomotor system. More specifically, we simulated three models: a base model with no adaptation to microstimulation, a model with desensitization, and a model with reward-dependent adaptation (see Materials and Methods for more details). Interestingly, we found that only the model with reward-dependent adaptation was able to replicate the pattern of our experimental data (Fig. 10). These results suggest that the interaction between reward and microstimulation through reward-dependent adaptation is a critical mechanism for capturing the experimental data. However, the presence of a second target-selection circuit remains a necessary component for a biologically plausible network model. Together, our experimental and modeling results provide strong evidence that robust saccadic choice relies on the interaction between adaptation to microstimulation and endogenous reward signals within the oculomotor systems and related circuits.
Behavior of models with a single target-selection circuit. Plots across rows A–D capture: change in
Discussion
By applying microstimulation during a dynamic value-based learning task, we found a specific pattern of change in choice behavior in response to microstimulation and reward feedback. More specifically, subthreshold electrical microstimulation of the FEF increased the selection of a target within the RF
It is possible that the observed pattern of response to microstimulation was partially influenced by the dynamic reward schedule used in our experiment, where the probability of reward on subsequent selection of the same target gradually decreased. However, similar long-lasting effects of microstimulation on nonstimulated trials have been previously reported for microstimulation of SC during a task in which the statistics of correct response were manipulated across a block of trials (Crapse et al., 2018). In addition, simulation results from the model with no adaptation demonstrate that reward feedback alone cannot fully explain our experimental findings. This suggests that the absence of an overall bias toward
The network model that best captured our experimental results included a competitive mechanism for adjustment of microstimulation efficacy and reward input in the oculomotor systems and related circuits. This suggests that microstimulation signals into the FEF compete with signals from the endogenous reward system, allowing each pathway to be upregulated or downregulated at the expense of the other. Moreover, this model can account for general characteristics of the effects of external perturbations on behavior and explains some of the more idiosyncratic effects of microstimulation in other studies. For example, Ni and Maunsell (2010) found that monkeys could be trained to detect microstimulation of V1 corresponding to particular retinotopic locations. Surprisingly, however, training to detect V1 microstimulation caused a profound impairment in the monkeys’ ability to detect veridical visual stimuli at the same visual location, and retraining on visual detection impaired the detection of microstimulation. Although we observed adaptation on the order of several trials rather than tens or hundreds of trials, as in Ni and Maunsell (2010), our model can explain their results qualitatively.
The proposed adaptive and compensatory mechanism, arising from the interaction between microstimulation and reward signals within the oculomotor system, can provide an alternative explanation for the inconsistent effects of electrical microstimulation on choice behavior or, equivalently, for how the oculomotor system remains robust against internal and external perturbations. For example, a recent meta-analysis of electrical stimulation for cognitive therapy (Grover et al., 2023) reported improvements in working memory and attention with electrical stimulation. In contrast, the same analysis found that electrical stimulation did not significantly modulate cognitive functions such as motor learning and decision-making. Our results are consistent with these findings because goal-driven and reward-dependent processes involved in learning and decision-making tasks are likely to be more robust to electrical stimulation due to stronger interaction between stimulation and reward signals during those tasks.
Our implementation of a homeostasis-like mechanism activated by microstimulation sheds light on the paradoxical observation that electrical stimulation can reduce hyperexcitability in brain networks, a phenomenon associated with various neurological disorders such as epilepsy and neuropathic pain (Chai et al., 2019). Interestingly, both electrical stimulation and medications that inhibit neuronal activity have been found to reduce this hyperexcitability effectively. This suggests that the increased excitatory currents resulting from electrical stimulation may be counteracted by further adaptations within the same circuits. Indeed, several studies have proposed that the therapeutic effects of stimulation in hyperexcitability-related neural disorders are attributed to homeostatic activity regulation resulting from the stimulation (Chai et al., 2019).
Finally, our findings of reward-dependent, adaptive, and compensatory mechanisms in target selection underscore the need for caution in interpreting the results of studies that employ stimulation for neurorehabilitation following brain damage (Plow et al., 2009; Dolbow et al., 2014; Cappon et al., 2016), that use stimulation to modulate cognitive processes such as memory (Antonenko et al., 2013; Alekseichuk et al., 2016; Aaronson et al., 2021), attention (Clayton et al., 2018; Dallmer-Zerbe et al., 2020), executive control (Borghini et al., 2018; Bramson et al., 2020), motor learning (Giustiniani et al., 2019; Harada et al., 2020), and learning and decision-making (Sela et al., 2012; Zavecz et al., 2020), or that use it to treat depression (Shekelle et al., 2018). More specifically, the majority of stimulation protocols involve participants receiving reward feedback, which can be explicit, like reward points, or implicit, such as a correct/incorrect message displayed on the screen. As demonstrated in our findings, such feedback can competitively interact with stimulation, systematically reducing its overall effect on behavior. Moreover, the individual variability observed in our study suggests that the interaction between stimulation and reward feedback may lead to more idiosyncratic outcomes, making the interpretation of these results specific to each individual. These observations indicate that to accurately identify the effects of stimulation, both local and global impacts of stimulation must be analyzed and understood in relation to reward outcomes on specific stimulation trials.
Footnotes
We thank Chanc VanWinkle Orzell for her helpful comments on the manuscript. This work is supported by the National Science Foundation (CAREER Award BCS1943767 to A.S.) and the National Institutes of Health (NIH EY014924 to T.M.).
The authors declare no competing financial interests.
Correspondence should be addressed to Alireza Soltani at soltani@dartmouth.edu.
This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license, which permits unrestricted use, distribution and reproduction in any medium provided that the original work is properly attributed.