## Abstract

The synaptic balance between excitation and inhibition (E/I balance) is a fundamental principle of cortical circuits, and disruptions in E/I balance are commonly linked to cognitive deficits such as impaired decision-making. Explanatory gaps remain in a mechanistic understanding of how E/I balance contributes to cognitive computations, and how E/I disruptions at the synaptic level can propagate to induce behavioral deficits. Here, we studied how E/I perturbations may impair perceptual decision-making in a biophysically-based association cortical circuit model. We found that both elevating and lowering E/I ratio, via NMDA receptor (NMDAR) hypofunction at inhibitory interneurons and excitatory pyramidal neurons, respectively, can similarly impair psychometric performance, following an inverted-U dependence. Nonetheless, these E/I perturbations differentially alter the process of evidence accumulation across time. Under elevated E/I ratio, decision-making is impulsive, overweighting early evidence and underweighting late evidence. Under lowered E/I ratio, decision-making is indecisive, with both evidence integration and winner-take-all competition weakened. The distinct time courses of evidence accumulation at the circuit level can be measured at the behavioral level, using multiple psychophysical task paradigms which provide dissociable predictions. These results are well captured by a generalized drift-diffusion model (DDM) with self-coupling, implementing leaky or unstable integration, which thereby links biophysical circuit modeling to algorithmic process modeling and facilitates model fitting to behavioral choice data. In general, our findings characterize critical roles of cortical E/I balance in cognitive function, bridging from biophysical to behavioral levels of analysis.

**SIGNIFICANCE STATEMENT** Cognitive deficits in multiple neuropsychiatric disorders, including schizophrenia, have been associated with alterations in the balance of synaptic excitation and inhibition (E/I) in cerebral cortical circuits. However, the circuit mechanisms by which E/I imbalance leads to cognitive deficits in decision-making have remained unclear. We used a computational model of decision-making in cortical circuits to study the neural and behavioral effects of E/I imbalance. We found that elevating and lowering E/I ratio produce distinct modes of dysfunction in decision-making processes, which can be dissociated in behavior through psychophysical task paradigms. The biophysical circuit model can be mapped onto a psychological model of decision-making which can facilitate experimental tests of model predictions.

- computational model
- decision making
- drift-diffusion model
- excitation-inhibition balance
- NMDAR hypofunction
- psychophysics

## Introduction

The synaptic balance of excitation and inhibition (E/I balance) is a fundamental principle of neural dynamics and computational function in cortical circuits. At the behavioral level, perturbations of cortical E/I balance through various experimental methods, including pharmacology and optogenetics, can induce deficits across a range of cognitive functions (Yizhar et al., 2011). E/I alterations are proposed to contribute to cognitive deficits which are prominent in multiple neuropsychiatric disorders, including schizophrenia and autism (Krystal et al., 2003; Kehrer et al., 2008; Lisman et al., 2008; Marin, 2012; Nakazawa et al., 2012; Zikopoulos and Barbas, 2013; Gao and Penzes, 2015; Lee et al., 2016; Sohal and Rubenstein, 2019). However, there remain critical gaps in across-level understanding of computational mechanisms by which synaptic-level perturbations alter circuit dynamics to impact cognitive function.

A core cognitive function altered in neuropsychiatric disorders is decision-making. One theoretical framework for decision-making computations is based on the accumulation of evidence across time to form a categorical choice (Gold and Shadlen, 2007; for nonaccumulation frameworks, cf. Cisek et al., 2009; Stine et al., 2020). In an influential two-alternative forced-choice (2AFC) discrimination paradigm, the subject must report the net direction of motion in a random-dot stimulus, which promotes decision-making based on integration across time of momentary perceptual evidence (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002; Gold and Shadlen, 2007; Kayser et al., 2010). Neural recordings have revealed coding of momentary evidence in sensory cortical regions (Britten et al., 1992; Gold and Shadlen, 2007), whereas neural activity in a distributed network including association cortical areas represents accumulation of evidence and choice formation (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002; Gold and Shadlen, 2007). Random-dot motion decision-making paradigms have been tested in patients with schizophrenia and autism, and both clinical populations were found to exhibit decision-making deficits (Milne et al., 2002; Chen et al., 2003, 2004, 2005; Koldewyn et al., 2010). These behavioral deficits could potentially arise from differences in the evidence accumulation process in association cortical circuits.

Computational modeling frameworks have provided valuable insights into the dynamical mechanisms of decision-making. The drift-diffusion model (DDM), which describes the integration of evidence as a diffusion process biased by an evidence-related drift rate, with a decision committed when the accumulated evidence reaches a threshold, has been widely applied to fit decision-making behavior (Ratcliff, 1978; Ratcliff and Rouder, 1998; Gold and Shadlen, 2007). At the level of neural circuit mechanisms, biophysically-detailed models of association cortical circuits can capture key behavioral and neurophysiological features during decision-making (Wang, 2008). In the model of Wang (2002), populations of selective excitatory neurons receive separate input streams. Each population accumulates evidence through ramping activity, because of strong recurrent excitation mediated by NMDA receptors (NMDARs), and the populations laterally inhibit each other via a population of GABAergic inhibitory interneurons, resulting in winner-take-all competition and categorical choice (Wong and Wang, 2006). As biophysical models provide circuit mechanisms underlying a cognitive function, their synaptic-level detail enables study of behavioral effects of E/I perturbations.

Here, we characterized how disruptions of cortical E/I balance affect decision-making function in a spiking circuit model (Wang, 2002). We found that both elevated and lowered E/I ratio can similarly impair decision-making function, when assessed by psychometric performance in a standard fixed-duration paradigm. However, the two perturbed E/I regimes make dissociable behavioral predictions for psychophysical paradigms that characterize the time course of evidence accumulation (Huk and Shadlen, 2005; Kiani et al., 2008; Nienborg and Cumming, 2009). These regimes could be well described by a generalized DDM that incorporates an imperfect integration process for evidence accumulation (Bogacz et al., 2006; Brunton et al., 2013; Shinn et al., 2020). This study links synaptic disruptions to cognitive dysfunction in well-established perceptual decision-making paradigms, and makes empirically testable predictions for cognitive task behavior under elevated versus lowered E/I ratio in association cortex.

## Materials and Methods

#### Spiking circuit model

A biophysically-based spiking circuit model is used to represent an association cortical circuit capable of decision-making computations. The model used is specified in full in Wang (2002), with changes from that model described below. In brief, the circuit model consists of *N*_{E} = 1600 excitatory pyramidal neurons and *N*_{I} = 400 inhibitory interneurons, all simulated as leaky integrate-and-fire neurons, which are recurrently interconnected to each other. Recurrent excitatory connections are mediated by both NMDAR and AMPAR conductances, and recurrent inhibition is mediated by GABA_{A}R conductances. Background and stimulus inputs are mediated by AMPAR conductances with Poisson spike trains. Within the excitatory neuron population are two nonoverlapping groups, of size *N*_{E,G} = 240, which are selective to evidence for a choice *A* or *B* (e.g., left vs right), and compete with each other via lateral inhibition mediated by a population of local interneurons (Fig. 1*A*). The remaining excitatory neurons are nonselective toward either choice.

Reflecting the aforementioned architecture, the connectivity patterns among excitatory neurons follow a “Hebbian” form as used previously (Brunel and Wang, 2001; Wang, 2002), whereby recurrent projections to neurons in the same selective group have a stronger synaptic strength, scaled up by a factor *w*_{+} > 1. The synaptic strength to neurons in the competing selective group and to nonselective excitatory neurons are scaled down by a factor

#### Stimulus

During stimulus presentation, input signals reflecting sensory evidence, putatively from an upstream area, are projected into the two selective neuron groups in the form of Poisson spike trains, whose differential spike rates represent momentary net sensory evidence in experiments. The strength of perceptual evidence is parameterized continuously by the coherence of the stimulus; 100%-coherence stimuli, corresponding to all perceptual evidence in favor of one choice, are simulated by maximal (minimal) input to the preferred (anti-preferred) population; 0%-coherence stimuli are simulated by equal input to both populations (Fig. 1*A*). In the example of the random dot motion stimuli, a 100%-coherence stimulus corresponds to all dots moving coherently in one direction, whereas a 0%-coherence stimulus corresponds to dots moving incoherently with no net global motion (Britten et al., 1992; Shadlen and Newsome, 2001; Roitman and Shadlen, 2002). To simulate the momentary sensory representation of motion signals from a random-dot motion stimulus, such as in area MT (Britten et al., 1992), spike rates for sensory inputs to the two groups of excitatory neurons are given by:
where μ_{0} is the overall stimulus strength, ρ is the upstream modulation parameter set to 1 by default, and *c ^{′}* is the stimulus coherence (Wang, 2002). Of note, this idealization of symmetric coherence dependence in Equation 1 is not exhibited by many neurons in area MT, which tend to show a larger increase for preferred motion direction than decrease for anti-preferred motion direction (Britten et al., 1993).
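As a minimal sketch of this input scheme, the snippet below assumes a symmetric linear dependence on coherence, with rates μ_{0}(1 + ρ*c ^{′}*) and μ_{0}(1 − ρ*c ^{′}*) for the two groups and coherence expressed as a signed fraction. This form is an assumption chosen to match the description above (equal input at 0% coherence, maximal and minimal input at 100% coherence); it is not a reproduction of Equation 1. The function name `stimulus_rates` is hypothetical.

```python
def stimulus_rates(coherence, mu0=38.0, rho=1.0):
    """Return (rate_A, rate_B) in Hz for a signed coherence given as a fraction in [-1, 1].

    Assumed form (not taken verbatim from the paper):
        rate_A = mu0 * (1 + rho * c),  rate_B = mu0 * (1 - rho * c)
    mu0 = 38 Hz is the stimulus strength stated in the Methods; rho = 1 is the
    control value and rho = 0.5 corresponds to the "upstream deficit" model.
    """
    if not -1.0 <= coherence <= 1.0:
        raise ValueError("coherence must be a fraction between -1 and 1")
    rate_a = mu0 * (1.0 + rho * coherence)
    rate_b = mu0 * (1.0 - rho * coherence)
    return rate_a, rate_b


if __name__ == "__main__":
    for c in (0.0, 0.128, 0.512, 1.0):
        print(c, stimulus_rates(c), stimulus_rates(c, rho=0.5))
```

Under this assumed form, halving ρ halves the difference between the two input rates at every coherence while leaving their sum unchanged.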

#### Choice readout

Simulation of the circuit model outputs spike-train data for the two excitatory populations, which are then converted to population activity, with a 0.001-s time-step, temporally smoothed by a causal exponential filter. In particular, for each spike of a given neuron, the histogram-bins corresponding to times before that spike receive no weight, while the histogram-bins corresponding to times after the spike receive a weight proportional to exp(−*t*/τ_{filter}), where *t* is the time of the histogram-bin after the spike, and τ_{filter} = 20 ms. The total weight contributed by each spike is then normalized to sum to 1.
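The smoothing step can be sketched as follows. This is an illustrative pure-Python version, not the authors' implementation: the function names, the truncation of the kernel after 10 time constants, and the conversion to Hz by dividing by the bin width are assumptions.

```python
import math


def smooth_spikes(spike_times, t_end, dt=0.001, tau=0.020):
    """Causal exponential smoothing of one neuron's spike times (s) into a rate trace (Hz).

    Each spike adds weight only to its own bin and later bins, decaying as
    exp(-t / tau); the weights contributed by one spike are normalized to sum to 1.
    """
    n_bins = int(round(t_end / dt))
    kernel_len = int(round(10 * tau / dt))  # assumption: truncate after 10 time constants
    kernel = [math.exp(-k * dt / tau) for k in range(kernel_len)]
    norm = sum(kernel)
    kernel = [w / norm for w in kernel]
    counts = [0.0] * n_bins
    for t_spk in spike_times:
        first = int(t_spk / dt)  # bin containing the spike
        for k, w in enumerate(kernel):
            if first + k >= n_bins:
                break  # weight falling after the end of the trial is dropped
            counts[first + k] += w
    return [c / dt for c in counts]  # weighted spikes per bin -> Hz


def population_rate(spike_trains, t_end, dt=0.001, tau=0.020):
    """Average the smoothed traces over all neurons in a population."""
    traces = [smooth_spikes(s, t_end, dt, tau) for s in spike_trains]
    return [sum(vals) / len(traces) for vals in zip(*traces)]
```

Because the filter is causal, the smoothed rate at a given time depends only on spikes that have already occurred, which matters when the trace is compared against a decision threshold.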

In general, stimulus input drives categorical, winner-take-all competitions such that the winning population will reach persistent activity of high firing rate (>30 Hz, in comparison to the baseline firing rate of ∼1.5 Hz), while suppressing the activity of the other population below baseline via lateral inhibition. To simulate 2AFC decision-making, a choice is selected when the corresponding population's firing rate crosses a threshold level (here, 15 Hz). It is also possible on a given trial that neither population crosses the decision threshold, which we denote here as an “indecision” trial. In that case, the behavioral response is selected randomly. This choice readout directly follows the implementation of Wang (2002). We set the probability of this random response to A versus B to be equal. If this probability is set asymmetrically, it can capture a subject's bias for one response in the absence of stimulus-driven choice selection. We note that an asymmetric probability of downstream response on indecision trials does not impact the overall choices' time course of sensitivity to evidence.
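The readout rule can be sketched as below, assuming the two smoothed population rates are available as equal-length sequences. The function name, the return format, and the tie-breaking rule for simultaneous crossings are assumptions made for illustration.

```python
import random


def read_out_choice(rate_a, rate_b, threshold=15.0, p_a_on_indecision=0.5, rng=random):
    """Return (choice, decided) from two smoothed population rate traces in Hz.

    The first population whose rate reaches the threshold determines the choice.
    If neither does, the trial counts as an indecision trial and the response is
    drawn at random, with probability p_a_on_indecision of reporting "A".
    """
    for r_a, r_b in zip(rate_a, rate_b):
        crossed_a = r_a >= threshold
        crossed_b = r_b >= threshold
        if crossed_a and crossed_b:
            # Assumption: simultaneous crossings are resolved by the larger rate.
            return ("A" if r_a >= r_b else "B"), True
        if crossed_a:
            return "A", True
        if crossed_b:
            return "B", True
    choice = "A" if rng.random() < p_a_on_indecision else "B"
    return choice, False
```

Setting `p_a_on_indecision` away from 0.5 corresponds to the asymmetric response bias on indecision trials discussed above.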

#### Synaptic perturbations

The synaptic conductance parameters of the spiking circuit model were perturbed, from a “control” parameter set, to examine the impact of alterations in E/I balance. Specifically, E/I perturbations were implemented through hypofunction of NMDARs at two sites: on inhibitory interneurons (I-cells), or on excitatory pyramidal neurons (E-cells; Fig. 1*B*). NMDAR hypofunction on I-cells (reduced *G*_{E→I}) results in elevated E/I ratio because of disinhibition. Conversely, NMDAR hypofunction on E-cells (reduced *G*_{E→E}) results in lowered E/I ratio. The perturbed models were also compared with an “upstream deficit” model which has weakened incoming signals of momentary sensory evidence. The upstream deficit model is the same as the control model, but with ρ = 0.5 in Equation 1, representing reduced selectivity of sensory coding in upstream sensory areas.

The remaining details of the model are described in Wang (2002), with the following adjustments to the original parameters, which were made to enhance the stability of the baseline and persistent-activity states when subject to E/I perturbations. To stabilize the persistent-activity states against reductions of *G*_{E→E}, we increased *w*_{+} to 1.84 from 1.70. To stabilize the baseline state against reductions of *G*_{E→I}, we reduced the external AMPAR conductance (*g*_{ext,AMPA}) to 2.07 nS from 2.1 nS. Finally, we changed the stimulus strength μ_{0} to 38 Hz from 40 Hz. For our default synaptic perturbation magnitudes, *G*_{E→I} is reduced by 3% in the “elevated E/I” circuit, and *G*_{E→E} is reduced by 2% in the “lowered E/I” circuit. These perturbation magnitudes preserve the stabilities, in the absence of stimulus input, of the low-activity baseline state and the high-activity memory state.

#### Stability criteria

The perturbations to E/I balance are such that both the high-activity memory states and the low-activity baseline state are stable over time. To determine the stability of the low-activity baseline state, for various perturbations of *G*_{E→E} and *G*_{E→I}, we performed 10 sets of 5-s simulations for each condition of *G*_{E→E} and *G*_{E→I} perturbation, with no stimulus inputs provided to any neurons. The circuit baseline state is labeled unstable if in the majority of the simulations it destabilizes, i.e., one of the two groups of excitatory neurons transitions from the baseline state to the high-activity memory state (>30-Hz firing rate) without stimulus input; otherwise, the circuit baseline state is considered stable. We note that these spiking neural circuits are intrinsically unstable over long time windows, because of finite-size fluctuation effects (as each selective population contains only 240 neurons). That is, given a long enough time, circuits in the low-activity state will sometimes transition to the high-activity state simply because of noise (without external inputs). This follows from the attractor dynamics nature of the system: the high-activity memory states seem to have deeper and wider basins of attraction than the low-activity baseline state (see Wong and Wang, 2006).

To determine the stability of the high-activity memory states, we performed 500 sets of simulations for each condition of *G*_{E→E} and *G*_{E→I} perturbations, applying a 2-s, 51.2% coherence stimulus also used in the standard task. For simulations where the stimulus triggered a winning population, if the memory state is not maintained 2 s after stimulus presentation (i.e., both population firing rates end at <15 Hz) in any of the 500 simulations, the memory state is labeled unstable; otherwise, the circuit memory state is considered stable. We selected this stringent criterion for stability of the memory state because of the intrinsic asymmetry of baseline and memory states and the transitions between them. In contrast to the low-activity state, from which the system will naturally jump to the high-activity state because of noise-driven finite-size fluctuations, the high-activity state, as an attractor state, is much more stable and the system will very rarely transition to the low-activity state (unless there is a strong reduction in recurrent excitation).
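The two labelling rules are deliberately asymmetric: a majority vote for the baseline state and an any-failure rule for the memory state. A minimal sketch, assuming each simulation has already been summarized (a boolean per baseline run, and the final pair of population rates per memory run in which a winner emerged), is given below; the function names and input formats are hypothetical.

```python
def baseline_is_stable(destabilized):
    """Baseline state: labeled unstable only if a majority of runs destabilize.

    `destabilized` holds one boolean per stimulus-free simulation, True when a
    selective group jumped to the high-activity state.
    """
    return sum(destabilized) <= len(destabilized) / 2


def memory_is_stable(final_rates, threshold=15.0):
    """Memory state: labeled unstable if ANY run ends with both rates below threshold.

    `final_rates` holds one (rate_A, rate_B) pair in Hz per simulation in which the
    stimulus triggered a winning population, measured 2 s after stimulus offset.
    """
    return all(max(r_a, r_b) >= threshold for r_a, r_b in final_rates)
```

For example, five destabilized runs out of ten still count as a stable baseline, whereas a single decayed memory state out of 500 runs marks the memory state as unstable.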

#### E/I ratio calculation

The E/I ratio in the spiking circuit is calculated from 10 sets of 5-s simulations for each condition of *G*_{E→E} and *G*_{E→I} perturbation, in the stable baseline state with no stimulus inputs. In the baseline state, the recurrent AMPAR, NMDAR, and GABA_{A}R synaptic input currents to an excitatory neuron group are recorded. The E/I ratio is defined here as the total recurrent excitatory (AMPAR plus NMDAR) synaptic input current divided by the total recurrent inhibitory (GABA_{A}R) synaptic input current.
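A minimal sketch of this ratio is shown below. It assumes the recorded currents are available as time series and that the ratio is taken between time-averaged magnitudes, so that the result does not depend on the sign convention for inward and outward currents; both choices, and the function name, are assumptions.

```python
def ei_ratio(i_ampa, i_nmda, i_gaba):
    """E/I ratio from recurrent synaptic current time series onto an excitatory group.

    Excitation is the sum of the mean AMPAR and NMDAR current magnitudes;
    inhibition is the mean GABA_A receptor current magnitude.
    """
    def mean(values):
        return sum(values) / len(values)

    excitation = abs(mean(i_ampa)) + abs(mean(i_nmda))
    inhibition = abs(mean(i_gaba))
    if inhibition == 0.0:
        raise ValueError("inhibitory current is zero; the ratio is undefined")
    return excitation / inhibition
```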

#### Psychophysical task paradigms

We used four psychophysical 2AFC task paradigms to characterize decision-making function: one standard task, and three tasks that characterize the time course of evidence accumulation. All paradigms are based on electrophysiological studies of perceptual decision-making in monkeys.

#### Standard task paradigm

Here we refer to the standard task paradigm as one in which a constant-coherence stimulus is applied for a fixed duration of 2 s (Britten et al., 1992; Shadlen and Newsome, 2001; Gold and Shadlen, 2007; Kayser et al., 2010). The coherence varies trial-by-trial from the set of {0%, 3.2%, 6.4%, 12.8%, 25.6%, 51.2%}. The psychometric function, giving the probability of a choice for one option as a function of stimulus coherence *c ^{′}* toward that option, is fit by the functional form:

#### Psychophysical kernel paradigm

The psychophysical kernel paradigm is based on the experimental paradigm of Nienborg and Cumming (2009). On each trial, stimuli are presented for 2 s total, with the coherence for each 0.05-s time bin randomly sampled from a uniform distribution over levels of {±6.4%, ±12.8%, ±25.6%}. The psychophysical matrix (*M*_{PK}) is computed as the difference in probabilities between the two choices for each coherence level at each stimulus time bin, normalized by the magnitude of the corresponding coherence level. The psychophysical matrix is then multiplied by the sign of the coherence and averaged over coherence levels to form the psychophysical kernel. The psychophysical kernel is therefore a form of choice-triggered average. The psychophysical kernel weight *W*_{PK} thus characterizes the time course of evidence accumulation, with larger weight at a given stimulus time corresponding to a larger impact of stimuli presented at that time to the resulting behavioral choice. Mathematically, denoting the two choices as *A* and *B* (*x*_{A} and *x*_{B}):

*t* has coherence *c ^{′}*. For this paradigm, we simulated 200,000 trials for each circuit model.
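The verbal definition of the kernel can be sketched in code as follows. This is an illustrative implementation of the description above (choice-probability difference per coherence level and time bin, normalized by coherence magnitude, signed, then averaged over levels); the data layout and function name are assumptions, and coherences are expressed as fractions.

```python
import random

LEVELS = (-0.256, -0.128, -0.064, 0.064, 0.128, 0.256)


def psychophysical_kernel(stimuli, choices, levels=LEVELS):
    """Compute the kernel weight for each stimulus time bin.

    stimuli: one sequence of per-bin coherences per trial (values drawn from `levels`).
    choices: one "A" or "B" per trial.
    """
    n_bins = len(stimuli[0])
    kernel = []
    for t in range(n_bins):
        total, used = 0.0, 0
        for c in levels:
            picks = [ch for s, ch in zip(stimuli, choices) if s[t] == c]
            if not picks:
                continue  # this level never occurred in this bin
            p_a = picks.count("A") / len(picks)
            m_pk = (p_a - (1.0 - p_a)) / abs(c)  # psychophysical matrix entry
            total += (1.0 if c > 0 else -1.0) * m_pk
            used += 1
        kernel.append(total / used if used else float("nan"))
    return kernel


if __name__ == "__main__":
    rng = random.Random(0)
    stim = [[rng.choice(LEVELS) for _ in range(4)] for _ in range(6000)]
    # Toy observer that only uses the first time bin.
    resp = ["A" if s[0] > 0 else "B" for s in stim]
    print(psychophysical_kernel(stim, resp))
```

In the toy example, the observer's choice depends only on the first bin, so the kernel is large in that bin and close to zero elsewhere, which is the signature of overweighting early evidence.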

#### Pulse paradigm

In addition to a constant 2-s stimulus of coherence levels as used in the standard task paradigm, a pulse of ±15% coherence strength and 0.1-s duration is applied at various onset times (Kiani et al., 2008). For each pulse onset time, the psychometric function is then fit according to

#### Variable duration paradigm

In the variable duration paradigm, stimuli identical to those in the standard task paradigm are presented, but with durations varying from 0.1 to 2 s across trials (Kiani et al., 2008). For each stimulus duration, the psychometric function (Eq. 2) is computed and fit, yielding a duration-specific discrimination threshold α. The dependence of the discrimination threshold on stimulus duration is then characterized as the time-dependent threshold function (Britten et al., 1992; Wang, 2002; Kiani et al., 2008). For each circuit model, 1000 simulations were performed for each condition of stimulus duration and coherence level.
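As a rough sketch of how a duration-specific threshold α might be extracted, the code below assumes a Weibull-type accuracy function, 1 − 0.5·exp(−(*c ^{′}*/α)^β), which is a common choice for this task but is used here only as a stand-in; the exact form of Equation 2 may differ. The parameters are estimated by a simple grid search over the binomial log-likelihood. All function names and grid values are illustrative.

```python
import math


def weibull_accuracy(c, alpha, beta):
    """Assumed psychometric form: probability correct at coherence c (a fraction)."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))


def log_likelihood(params, data):
    """Binomial log-likelihood of (coherence, n_correct, n_total) triples."""
    alpha, beta = params
    ll = 0.0
    for c, n_correct, n_total in data:
        p = min(max(weibull_accuracy(c, alpha, beta), 1e-9), 1.0 - 1e-9)
        ll += n_correct * math.log(p) + (n_total - n_correct) * math.log(1.0 - p)
    return ll


def fit_threshold(data, alphas, betas):
    """Return the (alpha, beta) pair on the grid that maximizes the log-likelihood."""
    grid = ((a, b) for a in alphas for b in betas)
    return max(grid, key=lambda ab: log_likelihood(ab, data))


if __name__ == "__main__":
    coherences = (0.032, 0.064, 0.128, 0.256, 0.512)
    data = [(c, round(1000 * weibull_accuracy(c, 0.1, 1.5)), 1000) for c in coherences]
    print(fit_threshold(data, [0.05, 0.075, 0.1, 0.125, 0.15], [1.0, 1.5, 2.0]))
```

Repeating such a fit separately for each stimulus duration gives one threshold per duration, which can then be plotted as the time-dependent threshold function. In practice a continuous optimizer would replace the grid search.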

#### Generalized DDM

We tested whether behavioral effects of disrupted E/I balance in the circuit model can be captured by a generalized DDM (Bogacz et al., 2006; Roxin and Ledberg, 2008; Brunton et al., 2013; Miller and Katz, 2013; Shinn et al., 2020). In particular, we hypothesized that E/I circuit alterations would correspond to alterations of the DDM integration process. We therefore incorporated a self-coupling term λ*x* in the generalized DDM, so that instead of perfect integration as in the standard DDM (λ = 0), the integration process can be leaky (λ < 0) or unstable (λ > 0; Bogacz et al., 2006). In this generalized DDM, a decision variable *x* starts at 0 and diffuses according to the stochastic differential equation:
where *dW* is the infinitesimal Wiener process (i.e., a Gaussian noise term of mean 0 and variance *dt*). When *x* reaches one of the decision bounds at *B* = ± 1, the corresponding choice (e.g., left or right) is selected.
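To make the role of the self-coupling term concrete, a single trial can be simulated with the Euler–Maruyama method. The sketch below assumes the stochastic differential equation takes the form *dx* = (λ*x* + μ*c ^{′}*)*dt* + σ*dW*, which is inferred from the description of the drift, self-coupling, and noise terms rather than copied from the equation itself. The values of μ and σ are those quoted in the parameter recovery section; the function name is hypothetical.

```python
import math
import random


def simulate_gddm_trial(coherence, lam=0.0, mu=14.3, sigma=1.33,
                        bound=1.0, dt=0.001, t_max=2.0, rng=random):
    """Simulate one trial; return (choice, decision_time).

    Assumed dynamics: dx = (lam * x + mu * coherence) dt + sigma dW, with x(0) = 0.
    lam < 0 gives leaky integration, lam = 0 perfect integration, and lam > 0
    unstable integration. choice is None if no bound is reached before t_max.
    """
    x = 0.0
    n_steps = int(round(t_max / dt))
    for step in range(n_steps):
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += (lam * x + mu * coherence) * dt + noise
        if x >= bound:
            return "A", (step + 1) * dt
        if x <= -bound:
            return "B", (step + 1) * dt
    return None, t_max
```

With the noise switched off, a leaky integrator (λ < 0) driven by weak evidence saturates at μ*c ^{′}*/|λ| and never reaches the bound, whereas a perfect integrator always does given enough time. This is one way to see why leaky integration produces indecision.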

#### Implicit Fokker–Planck method for generalized DDMs

While the standard version of DDM has a semi-analytical solution, most versions of generalized DDMs, such as one with imperfect integration as studied here, do not. For efficient simulation of the generalized DDM, which enables fitting model parameters to spiking circuit model data and potentially experimental choice data, we used the Fokker–Planck equation (Brown and Holmes, 2001; Kiani and Shadlen, 2009; Shinn et al., 2020). In particular, Equation 6 can be recast in terms of the probability distribution function [PDF; *p*(*x*,*t*)] of the decision variable *x* at any time *t*:

The Fokker–Planck equation has absorbing boundary conditions at the decision bounds. In the context of the DDM, any probability density of *x* that diffuses out of either absorbing bound ± *B* adds to the corresponding probability committed to the corresponding choice and is no longer considered by Equation 7.

For numerical simulation, Equation 7 can be discretized with step sizes Δ*x* and Δ*t* in (decision variable) space and time. The evolution of the PDF can then be recursively approximated by computing the PDF at each position based on the neighboring ones at the previous time-step. This algorithmic approach is formalized as the explicit method of finite difference equation, and previous work has used it to study DDMs (Kiani and Shadlen, 2009). By expressing the right side of discretized Equation 7 in the previous time-step, only one term in the equation depends on the current time-step. This allows the PDF at each location to be entirely determined by those at the previous time-step, allowing for a relatively straight-forward recursive implementation. However, the explicit method has a stability criterion (*t* (given a reasonable Δ*x*), making the explicit method prohibitively slow for the purpose of fitting generalized DDM parameters to data (Shinn et al., 2020).

To overcome these limitations of the explicit method, we used the implicit method for Equation 7, expressing the right side of the equation at the current time-step. Specifically, letting *x* = *i*Δ*x* and *t* = *j*Δ*t*, we have:
*x* = *i*^{′}Δ*x* and *t* = *j*^{′}Δ*t*, for any integers *i*^{′} and *j*^{′}. With multiple terms at the current time-step, Equation 8 cannot be directly solved for each *i*. However, Equation 8 can be expressed in the matrix form:

*t* = *j*Δ*t*, and similarly for

Δ*x* and Δ*t* (although the numerical error still increases with both Δ*x* and Δ*t*, thus constraining the step sizes). This approach allows for much faster computation of generalized DDMs, sufficient for fitting DDM parameters to data (Shinn et al., 2020).

We thus solve generalized DDMs numerically using the implicit method with absorbing boundaries (Butcher, 2008; Shinn et al., 2020), with grid-size Δ*x* = 0.02 and time-step Δ*t* = 0.001 s. This results in a PDF of the decision variable *x* within the boundaries *B* = ± 1, as well as the probability to cross the boundaries, at each time-step. After 2 s of stimulus presentation, the total probabilities that crossed the upper and lower bounds are taken as the probabilities of the two choices. Remaining probability density within the boundaries, which we treat as analogous to indecision trials in the spiking circuit, is split evenly between the two choices to fit 2AFC paradigms; results were similar whether the remaining density was split evenly or proportionally to the total probabilities at each side of 0.
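The following is a compact, self-contained sketch of an implicit (backward Euler) Fokker–Planck solver with absorbing bounds. It is not the PyDDM implementation. It assumes drift λ*x* + μ*c ^{′}* and constant noise σ, uses central differences in space, tracks probability mass per grid point, and solves the resulting tridiagonal system with the Thomas algorithm. All function names are hypothetical.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system; lower[0] and upper[-1] are unused."""
    n = len(diag)
    c_prime, d_prime = [0.0] * n, [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_gddm(coherence, lam=0.0, mu=14.3, sigma=1.33, bound=1.0,
               dx=0.02, dt=0.001, t_max=2.0):
    """Return (p_upper, p_lower, p_undecided) after t_max seconds.

    Assumed dynamics: dx = (lam * x + mu * coherence) dt + sigma dW, absorbed at +/- bound.
    """
    n_half = int(round(bound / dx))
    xs = [k * dx for k in range(-n_half + 1, n_half)]  # interior grid points only
    n = len(xs)
    adv = [(lam * x + mu * coherence) * dt / (2.0 * dx) for x in xs]  # drift terms
    diff = 0.5 * sigma ** 2 * dt / dx ** 2                            # diffusion term
    # Row i of the implicit system couples point i to its neighbors at the new time-step.
    lower = [0.0] + [-adv[i - 1] - diff for i in range(1, n)]
    upper = [adv[i + 1] - diff for i in range(n - 1)] + [0.0]
    diag = [1.0 + 2.0 * diff] * n
    mass = [0.0] * n
    mass[n_half - 1] = 1.0  # all probability starts at x = 0
    p_upper = p_lower = 0.0
    for _ in range(int(round(t_max / dt))):
        mass = solve_tridiagonal(lower, diag, upper, mass)
        # Mass leaving through each absorbing bound during this step.
        p_upper += (diff + adv[-1]) * mass[-1]
        p_lower += (diff - adv[0]) * mass[0]
    return p_upper, p_lower, sum(mass)


if __name__ == "__main__":
    for lam in (-10.0, 0.0, 10.0):
        print(lam, solve_gddm(0.064, lam=lam))
```

The boundary fluxes are chosen so that the three returned probabilities sum to 1 up to rounding error. The remaining mass can then be split evenly between the two choices, as described above.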

For the standard task, pulse paradigm, and variable duration paradigm, the generalized DDM can compute the PDFs of both choices in each stimulus condition exactly, such that no stochasticity is involved and only one run of the Fokker–Planck calculation is needed. For the psychophysical kernel paradigm, because of the large number of possible stimulus conditions, we simulated 100,000 trials for each generalized DDM.

#### Fitting generalized DDM parameters

All functions are fit using the method of maximum likelihood estimation. The psychometric fit functions (Eqs. 2, 5) are fit by maximizing the average log-likelihood:
*c ^{′}* is spanned over the coherence levels specified for the particular task. From the circuit model or generalized DDM,

*A* or *B*, respectively, for each coherence level *c ^{′}*.

The generalized DDM model parameters can be fit to the choice and indecision states of neural activity in the standard task, with the maximization function:

*c ^{′}* is spanned over the coherence levels specified for the standard task paradigm. From the spiking circuit simulations,

*A* or *B* crosses the decision threshold, and

*c ^{′}*. From the generalized DDM being fit to the circuit data,

*c ^{′}*.

#### Parameter recovery for generalized DDMs

To demonstrate that the proposed paradigms can be applied in behavioral experiments with resulting choice data used to fit the self-coupling parameter of the generalized DDM, we performed parameter recovery analyses. To correspond to behavioral analysis of empirical data collected in an experiment, we used only simulated behavioral choice data, without any additional knowledge of neural states (e.g., indecision trials). We simulated discrete trials in the pulse paradigm using generalized DDMs, then used the resulting psychometric choice data to fit the parameters of a generalized DDM using the Fokker–Planck method.

Specifically, we simulated multiple generalized DDMs with different levels of self-coupling (λ varied from −10 to 10 s^{−1}, with step size 1). For simplicity, we used μ = 14.3 s^{−1} and σ = 1.33 s^{−0.5}. For each generalized DDM, we numerically simulated trials using the Monte Carlo method. To be roughly comparable to experimentally feasible trial numbers, we simulated 1000 trials for each combination of coherence level and pulse onset time. With 11 coherence levels and 20 pulse onset times, this resulted in 220,000 total trials per generalized DDM.
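The trial counts follow directly from the design of the paradigm. A short check is sketched below, covering both the full pulse paradigm and the simplified three-onset variant; the helper name is hypothetical.

```python
def total_trials(n_coherences, n_onsets, trials_per_condition=1000):
    """Number of simulated trials for a fully crossed coherence x onset-time design."""
    return n_coherences * n_onsets * trials_per_condition


full = total_trials(11, 20)        # full pulse paradigm
simplified = total_trials(11, 3)   # simplified paradigm with three pulse onset times
lambdas = list(range(-10, 11))     # self-coupling values in s^-1, step size 1

print(full, simplified, len(lambdas))
```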

The simulated choice data are then fit with a generalized DDM, using the implicit Fokker–Planck method. The fitting is performed across coherence and pulse onset times, by minimizing
*c ^{′}* is spanned over the coherence levels specified for each psychophysical task, and *t*_{on} is spanned over the pulse onset times.

Finally, we repeated the above procedure with a simplified pulse paradigm, with three pulse onset times at 0.5, 1, and 1.5 s. With 1000 simulated trials for each coherence and pulse onset times, this paradigm had 33,000 total trials per GDDM. Of note, the trial conditions (

#### Code availability

The spiking circuit model was implemented using the Python-based Brian neural simulator (Goodman and Brette, 2008), with a simulation time-step of 0.02 ms. The Fokker–Planck solver for the generalized DDM was implemented in custom-written Python code. Simulation code for the model is publicly available: https://github.com/murraylab/decisionmaking-perturbation. Code for Fokker–Planck simulation and fitting of generalized DDMs is available via the PyDDM package (https://pypi.org/project/pyddm/; Shinn et al., 2020).

## Results

We studied decision-making behavior in a well-established spiking circuit model (Wang, 2002). This model, and others with similar circuit architecture (Martí et al., 2008; Roxin and Ledberg, 2008; Eckhoff et al., 2009), have been used in a large body of computational modeling research on decision-making. The circuit model contains two populations of excitatory neurons, each of which receives an input reflecting evidence for one option (Fig. 1*A*). The circuit can integrate evidence to form a decision through attractor dynamics, which depends on strong recurrent excitatory connections within each group (Wong and Wang, 2006). The two populations also excite a group of inhibitory interneurons, which projects equally to the two excitatory groups and mediates lateral and feedback inhibition. This lateral inhibition can implement winner-take-all competition and categorical decision-making. In particular, stimulus input generally drives the activity of one population to rapidly ramp up, eventually reaching a high-activity decision state. This ramping of activity simultaneously suppresses the other population, allowing for choice readout based on one population crossing an activity-level threshold (Wang, 2002; Wong and Wang, 2006). Because of the inherent stochasticity from Poisson background inputs to all neurons, the model generates probabilistic choices whose proportions vary in a graded manner with net evidence.

### E/I imbalance affects decision-making in a cortical circuit model

To study the effects of E/I balance on decision-making behavior, we simulated hypofunction of NMDARs on excitatory and inhibitory neurons, as the specific mechanism by which to perturb E/I balance (Fig. 1*B*). We selected NMDAR hypofunction as a synaptic perturbation because it is implicated in the pathophysiology of schizophrenia, and NMDAR antagonists such as ketamine are used as pharmacological models of neural and behavioral features observed in schizophrenia (Krystal et al., 1994, 2003; Abi-Saab et al., 1998; Greene, 2001; Kehrer et al., 2008; Lisman et al., 2008; Nakazawa et al., 2012; Weickert et al., 2013). While disruption of E/I balance can be mediated by other synaptic mechanisms, potentially depending on the particular psychiatric disease or pharmacological intervention, the first-order effects of E/I balance disruption are likely to converge (Murray et al., 2014). In particular, we considered three circuit models with different E/I ratios: control; elevated E/I, via NMDAR hypofunction on inhibitory interneurons; and lowered E/I, via NMDAR hypofunction on excitatory pyramidal neurons (Fig. 1*B*). Importantly, all three circuit regimes preserve the stabilities of the low-activity baseline state and high-activity memory states in the absence of stimulus input.

We first characterized choice accuracy of the three spiking circuit models using the psychometric function in the standard task paradigm, in which the stimulus on each trial has a constant coherence throughout a fixed duration. In all three circuit models, the choice accuracy increases monotonically with coherence. We found that both the elevated and lowered E/I circuits yielded comparably impaired performance relative to control (Fig. 2*A*). This impairment can be quantified as a higher discrimination threshold, i.e., the coherence level needed to reach a threshold level of performance. Therefore, choice accuracy in the standard task alone is insufficient to behaviorally dissociate these neurophysiological regimes of elevated versus reduced E/I ratio in the model.

To more systematically examine the circuit's dependence on E/I ratio, we parametrically decreased both the NMDAR conductances on interneurons (*G*_{E→I}) and on pyramidal neurons (*G*_{E→E}) concurrently, while characterizing decision-making performance (Fig. 2*B*). For the range of relatively weak perturbations tested here, we found that if *G*_{E→I} and *G*_{E→E} are reduced proportionally, such that E/I balance is maintained, decision-making performance is approximately unaltered (Fig. 2*B*). However, we note that large parallel reductions in synaptic parameters, beyond the ranges shown in Figure 2*B*, have been shown to impair decision-making function, insofar as winner-take-all competition depends on having strong recurrent E/I (Wong and Wang, 2006; Murray et al., 2017b).

We explicitly calculated E/I ratio as the ratio of excitatory and inhibitory recurrent synaptic input currents in pyramidal neurons (Fig. 2*B*). We found that E/I ratio provides a concise description of choice accuracy that approximately collapses the two-dimensional space of perturbations onto a one-dimensional curve. Specifically, we found that decision-making performance exhibits an inverted-U dependence on E/I ratio, wherein performance is maximal at an intermediate value of E/I ratio and performance degrades with reduction or elevation of E/I ratio (Fig. 2*C*). We note that the control circuit is slightly offset from the sensitivity peak, as the circuit parameters were not fine-tuned to optimize performance in the standard task. These findings indicate that the relative E/I ratio is a critical effective parameter for decision-making function in the circuit, rather than absolute strengths of particular synaptic connections.

To gain insight into circuit dynamics underlying functional impairment from E/I perturbations, we characterized neural activity during decision-making. Figure 3*A* shows representative single-trial activity traces for the zero-coherence stimulus condition. The circuit exhibits winner-take-all competition and categorical decision-making: the winning population activity (dark colors) ramps up, crossing the decision threshold from which choice is read out and eventually reaching a high-activity attractor state, while the losing population activity (light colors) is suppressed below baseline.

In the elevated-E/I circuit, ramping of activity to the decision threshold is faster relative to control (Fig. 3*A–C*). These circuit dynamics suggest that decision-making performance is impaired because the decision is formed earlier on the basis of less net perceptual evidence, without improving the effective signal-to-noise ratio through temporally extended accumulation of evidence. By contrast, the lowered-E/I circuit exhibits slower ramping. As characterized below (Fig. 4), the lowered-E/I circuit also exhibits trials in which neither neuronal population crosses the decision threshold, which we call indecision trials (Fig. 3*A*, gray traces). On such trials, the behavioral response is chosen randomly, which leads to more errors for the lowered-E/I circuit. As with the behavioral measure of discrimination sensitivity (Fig. 2*B*,*C*), these features of circuit dynamics during decision-making followed a dependence on the net E/I ratio as an effective parameter (Fig. 3*D–F*). These circuit dynamics make a testable prediction for neural activity in the fixed-duration standard task: the timing of decision-related neural activity predicting the behavioral response should be shorter or longer under elevated-E/I or lowered-E/I regimes, respectively.

As noted above, the lowered-E/I circuit is more prone to failing to reach a categorical choice state, in which case the behavioral response is selected randomly. We characterized how the proportion of these indecision trials, which are classified by neural activity rather than behavior, depends on coherence and E/I perturbation (Fig. 4*A*,*B*). In the lowered-E/I circuit, indecision trials occurred preferentially at low stimulus coherences (Fig. 4*A*), which results in more stochastic choices on low-coherence trials. We can isolate in the models only the trials for which the neural activity does reach a high-activity choice state. Interestingly, if conditioned on these neurally-defined threshold-crossing trials, the lowered-E/I circuit exhibits better psychometric performance than the control circuit (Fig. 4*C*). In this way, the model makes testable predictions for neural activity in decision-making circuits with lowered E/I ratio, specifically probabilistic failure of categorical choice-related activity on low-coherence trials (Fig. 4*A*) and higher accuracy on those trials which do exhibit choice-related neural activity (Fig. 4*C*).

Based on these observations, we describe decision-making in the elevated-E/I and lowered-E/I circuits as “impulsive” and “indecisive,” respectively, relative to the control circuit. These findings show that the similarly impaired psychometric performance arises from distinct dynamical circuit mechanisms in the two E/I imbalance regimes. The above results also show that psychometric performance alone in the standard task paradigm is not well suited to disambiguate performance deficits arising from the circuit regimes of elevated versus lowered E/I ratio in the model. Which behavioral tasks are well suited to dissociate between these distinct decision-making impairments? Based on observations of circuit dynamics, we hypothesized that these circuit alterations would make dissociable predictions for the time course of evidence accumulation.

One approach to characterize the time course of evidence accumulation is to use a “reaction time” variant of the standard task, which has no fixed stimulus duration and allows the subject to respond at any time after stimulus onset (Roitman and Shadlen, 2002), thereby potentially revealing dissociable patterns in the distribution of response times (Miller and Katz, 2013). As shown in Figure 3*C*, the decision time, defined by neural activity and reflecting internal choice commitment, is sensitive to E/I balance. Reaction times are typically modeled as a threshold-crossing decision time plus a nondecision time reflecting afferent and efferent delays (Ratcliff and McKoon, 2008). If nondecision times are unchanged by the E/I perturbation, then the model predicts that elevated and lowered E/I ratio will result in faster and slower responses, respectively, in a reaction time paradigm. A complication in using response times as a diagnostic measure for characterizing decision-making alterations by pharmacology or neuropsychiatric disorders is that core psychomotor functions are broadly altered by pharmacology and neuropsychiatric disorders (Guillermain et al., 2001; Micallef et al., 2002; Taffe et al., 2002; Morrens et al., 2007; Kaiser et al., 2008; Schrijvers et al., 2008; Morsel et al., 2015), which in turn could impact the durations of nondecision times and even their intraindividual variability (Vinogradov et al., 1998). Based on these considerations, a more feasible experimental design in which to study reaction times as a probe of E/I perturbations would be within-subject comparison under an acute causal manipulation of E/I ratio, especially if localized to an association cortical decision-making circuit as could be done via optogenetic or chemogenetic approaches in animals. Such a within-subject design would not be complicated by between-subject variation in nondecision times.

The field of decision-making has devised a number of 2AFC task paradigms which measure the influence of evidence at different time points on the behavioral response. Below, we describe model behavior in three fixed-duration paradigms which are grounded in electrophysiological studies of perceptual decision-making and can characterize the time course of evidence accumulation.

### Psychophysical kernel paradigm

One method to explicitly characterize the time course of evidence accumulation is through the psychophysical kernel task paradigm (Nienborg and Cumming, 2009), which uses random time-varying stimuli to quantify the weight that the stimulus at a given time point has on behavioral choice (Fig. 5*A*). The psychophysical kernel matrix summarizes the net probability that one choice is selected over the other, across all trials in which the stimulus was at a given coherence level for a given time-step, normalized by the magnitude of the corresponding coherence level (Fig. 5*C–E*). The psychophysical kernel is then obtained by collapsing the psychophysical kernel matrix magnitude across coherence levels (Fig. 5*B*). The psychophysical kernel thus reflects the impact of stimuli on the choice as a function of time: stimuli at time points with a large psychophysical kernel weight have a large influence on choice, whereas stimuli at time points with weight near zero have little impact on choice.
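The kernel computation described above can be sketched in a few lines. This is a minimal illustration, not the analysis code used for Figure 5: the function name, the array layout, the four coherence levels, and the synthetic observer with linearly decaying weights are all hypothetical, and "collapsing across coherence levels" is assumed here to mean averaging the sign-corrected matrix entries.

```python
import numpy as np

def psychophysical_kernel(stimuli, choices, levels):
    """Estimate the kernel matrix and kernel from fixed-duration trials.

    stimuli: (n_trials, n_steps) coherence shown at each time step
    choices: (n_trials,) behavioral choice coded as +1 or -1
    levels:  nonzero coherence levels used to build the stimuli
    """
    stimuli = np.asarray(stimuli, dtype=float)
    choices = np.asarray(choices, dtype=float)
    levels = np.asarray(levels, dtype=float)
    n_steps = stimuli.shape[1]
    matrix = np.zeros((len(levels), n_steps))
    for i, c in enumerate(levels):
        for t in range(n_steps):
            shown = np.isclose(stimuli[:, t], c)
            # Mean of +1/-1 choices equals P(choice +1) - P(choice -1),
            # normalized by the magnitude of the coherence level.
            matrix[i, t] = choices[shown].mean() / abs(c)
    # Assumed collapsing rule: average sign-corrected entries over levels.
    kernel = (np.sign(levels)[:, None] * matrix).mean(axis=0)
    return matrix, kernel

# Synthetic observer (hypothetical): weights early evidence more than late.
rng = np.random.default_rng(0)
levels = np.array([-0.2, -0.1, 0.1, 0.2])
stimuli = rng.choice(levels, size=(20000, 10))
true_weights = np.linspace(1.0, 0.0, 10)
evidence = stimuli @ true_weights + 0.2 * rng.standard_normal(20000)
choices = np.where(evidence >= 0, 1, -1)
matrix, kernel = psychophysical_kernel(stimuli, choices, levels)
```

For this synthetic observer the recovered kernel should decline over time steps, mirroring the decaying weights used to generate the choices.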

We found that for the control circuit, the psychophysical kernel exhibits a rise-then-decay profile (Fig. 5*B*,*C*), which is qualitatively consistent with what was measured in a number of primate and mouse behavioral experiments during perceptual decision-making (Kiani et al., 2008; Nienborg and Cumming, 2009; Wimmer et al., 2015; Odoemene et al., 2018). This shape of the kernel results from the recurrent dynamics of the circuit during decision formation. As shown in Wong and Wang (2006), when the low-coherence stimulus is presented, the low-activity baseline state disappears and the state of the system flows along a stable manifold toward an unstable saddle point. The initial rise of the psychophysical kernel arises because the system becomes more sensitive to input as it moves closer to the saddle point. The decline of the psychophysical kernel reflects bounded accumulation (Wong and Wang, 2006; Kiani et al., 2008): after the state of the circuit diverges toward an attractor state, resulting in decision commitment, the final state is less sensitive to subsequent evidence. The circuit is thus in a “decision state” largely indifferent to late evidence. We note that in these simulations the input kernel on the stimulus does not change over time, i.e., early and late stimuli are represented equally as inputs to the circuit. Instead, the shape of the psychophysical kernel is determined by the decision-making process (Okazawa et al., 2018).

Relative to control, the psychophysical kernel for the elevated-E/I circuit assigns much more weight to early time points and much less weight to late time points (Fig. 5*B*,*D*). For the lowered-E/I circuit, the psychophysical kernel is flattened and has lower weights overall (Fig. 5*B*,*E*); in this regime, the choice is generally less driven by evidence (i.e., the integral of the psychophysical kernel is low). These findings are consistent with characterizations of impulsive versus indecisive decision-making for elevated versus lowered E/I ratio, respectively. In contrast to the standard task paradigm, these two E/I perturbations thereby make dissociable behavioral predictions in the psychophysical kernel task paradigm, and demonstrate how alterations in the temporal weighting of sensory input can be driven by E/I disruptions in decision-making circuits.

### Pulse paradigm

The pulse paradigm is another psychophysical task that measures the impact of the stimulus at different times on the committed choice (Huk and Shadlen, 2005; Wong et al., 2007; Kiani et al., 2008). In addition to the standard constant stimulus with fixed duration, a brief pulse of perceptual evidence is added at a variable time within the stimulus presentation (Fig. 6*A*). The influence of the pulse on the decision, as a function of pulse onset time, can be quantified by a horizontal shift in the psychometric function. Kiani et al. (2008) found that this pulse-induced shift decreased at later times.

We found that this pulse-induced shift follows a similar time dependence as the psychophysical kernel. In the control circuit, the magnitude of the shift increased over very early onset times and then decreased later on (Fig. 6*B*,*C*). Relative to control, the elevated-E/I circuit showed a stronger shift for early onset times, but the psychometric function was much less sensitive to pulses at later onset times (Fig. 6*B*,*D*). In contrast, the lowered-E/I circuit showed a flattened pattern, indicating a weaker dependence on stimuli over time in comparison to the other circuits (Fig. 6*B*,*E*).

While the psychophysical kernel and pulse paradigms both measure the time course of decision-making and yield consistent results, we note that there are important differences between the two paradigms. In particular, results from the pulse paradigm may be obtained with many fewer trials than the psychophysical kernel paradigm, which requires an extensive number of trials to sufficiently sample stimulus conditions. In addition, the psychophysical kernel paradigm measures the time course of decision-making with weak net stimulus strength whereas the pulse paradigm shift is derived from a wider range of net stimulus strengths.

### Variable duration paradigm

The two paradigms above characterize the time course of evidence accumulation for decision-making, and demonstrate dissociable predictions of elevated and lowered E/I ratio in the circuit model. This suggests that E/I perturbations might also differentially alter the decision-making process in relation to stimulus duration. We therefore examined the variable duration paradigm, which was previously used to reveal the bounded nature of evidence accumulation in decision-making (Britten et al., 1992; Watamaniuk and Sekuler, 1992; Wang, 2002; Kiani et al., 2008; Tsetsos et al., 2015). The variable duration paradigm consists of trials with varying stimulus coherence and duration. For each stimulus duration, the psychometric function is calculated to derive the discrimination threshold (Eq. 2). The time-dependent threshold function is defined as the discrimination threshold as a function of stimulus duration.

If evidence accumulation were perfect and unbounded, then increasing the stimulus duration should always reduce the discrimination threshold. Instead, experimental studies show that this decrease in the time-dependent threshold function plateaus at long durations (Britten et al., 1992; Kiani et al., 2008), indicating that late stimuli no longer substantially improve choice accuracy after total evidence reaches a bound. The control circuit model demonstrates a similar plateau (Fig. 7*B*), and is consistent with the observation in the psychophysical kernel and pulse paradigms that later stimuli have less influence on choice (Wang, 2002). We therefore examined how E/I perturbations affect the time-dependent threshold function. The key features in this characterization are the discrimination threshold plateau value and the stimulus duration at which the plateau occurs, reflecting the time course of evidence accumulation (Fig. 7*B*,*C*).
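As a reference point for the time-dependent threshold function, the perfect and unbounded case can be worked out directly. The sketch below assumes a decision variable that integrates drift μ*c* and noise of strength σ without bounds, so that after duration *T* it is normally distributed with mean μ*cT* and variance σ²*T*, giving accuracy Φ(μ*c*√*T*/σ). The function name, the target accuracy of 0.816, and the default μ and σ (taken from the DDM fit reported in the fitting section) are illustrative assumptions; this is not the threshold definition of Eq. 2.

```python
import math
from statistics import NormalDist

def perfect_integration_threshold(duration, mu=14.3, sigma=1.33,
                                  target_accuracy=0.816):
    """Coherence needed to reach target_accuracy under perfect, unbounded
    integration, where accuracy = Phi(mu * c * sqrt(T) / sigma).

    All defaults are illustrative assumptions (see lead-in).
    """
    z = NormalDist().inv_cdf(target_accuracy)
    return z * sigma / (mu * math.sqrt(duration))

# Threshold keeps falling as 1/sqrt(T), with no plateau.
durations = [0.25, 0.5, 1.0, 2.0]
thresholds = [perfect_integration_threshold(T) for T in durations]
```

Under these assumptions, quadrupling the stimulus duration halves the threshold at every duration, so any plateau in a measured threshold function signals a departure from perfect, unbounded integration.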

Varying the stimulus duration trial-by-trial, we found that the duration of plateau differs substantially across the E/I regimes, in a manner consistent with the other task paradigms examined. In the elevated-E/I circuit, the discrimination threshold plateaus at a higher value relative to the control circuit (Fig. 7*B*,*D*), which indicates reduced choice accuracy at long stimulus durations. The plateau also occurs at an earlier stimulus duration, indicating a shorter integration timescale. Interestingly, the elevated-E/I circuit has a lower threshold than control for very short stimulus durations, because at such short durations the control circuit fails to reach a decision-related attractor state on a greater fraction of trials. The lowered-E/I circuit consistently shows a higher discrimination threshold than control, indicating lower accuracy, and its threshold declines without plateauing up to 2-s stimulus durations (Fig. 7*B*,*E*).

### Comparison to upstream sensory coding deficit

In the sections above, we demonstrated three nonstandard task paradigms which, by characterizing the time course of evidence accumulation for decision-making, can behaviorally dissociate distinct decision-making impairments induced by elevated versus lowered E/I ratio. In contrast to dysfunction within a decision-making circuit, as considered above, impairments could also arise from dysfunction in upstream sensory representations. Neuropsychiatric disorders such as schizophrenia have long been associated with deficits in visual perception (Butler et al., 2008; Silverstein, 2016), which could potentially impair perceptual decision-making behavior. In primate studies using random-dot motion decision-making paradigms, inactivation of area MT, which represents momentary motion signals (Britten et al., 1992; Gold and Shadlen, 2007), has been shown to bias decision-making processes as well (Fetsch et al., 2018; Jin and Glickfeld, 2019), demonstrating how upstream sensory coding deficits can contribute to decision-making impairments.

We investigated the effects of sensory coding deficits by weakening the modulation of stimulus inputs by stimulus coherence (Eq. 1; Fig. 8*A*). We found that in the standard task paradigm, an upstream sensory coding deficit can impair performance similarly to E/I perturbations (Fig. 8*B*). However, the nonstandard task paradigms examined above reveal dissociable predictions for these distinct circuit perturbations (Fig. 8*C–E*). Specifically, the primary effect of the upstream sensory coding deficit on these measures is a rescaling while preserving their temporal profiles. This is expected because the upstream coding deficit does not alter the dynamics of the decision-making processes in the circuit, in contrast to the recurrent E/I perturbations.

Of note, in the psychophysical kernel paradigm, the control and upstream-deficit circuits differ in their magnitudes but not time courses, as quantified below (Fig. 8*C*). In principle, it is possible to compare empirically measured kernel weights between groups of subjects. However, drawing conclusions between individual subjects would be complicated by subjects likely differing in their overall sensitivity, which would also scale the magnitude of the kernel. Testing within-subject differences under a causal perturbation would likely be more feasible in this scenario.

We quantified the similarity of time courses via cosine-similarity. For two time courses *w*_{1}(*t*) and *w*_{2}(*t*), the cosine-similarity is defined as the normalized inner product (*w*_{1} · *w*_{2})/(‖*w*_{1}‖ ‖*w*_{2}‖). A cosine-similarity of 1 indicates that the two time courses are related by only multiplicative scaling. For the psychophysical kernel (Fig. 8*C*), comparison to the control circuit yielded cosine-similarity values of 0.98 for upstream deficit, 0.79 for elevated-E/I, and 0.86 for lowered-E/I. Similarly, for the psychometric shift in the pulse paradigm (Fig. 8*D*), this yielded similarity values of 0.98 for upstream deficit, 0.85 for elevated-E/I, and 0.87 for lowered-E/I. This quantification shows that the shape of the time course is preserved under upstream deficit, and only rescaled for the psychophysical kernel, in contrast to how the shape of the time course is altered by E/I perturbation. Alteration in these time course shapes can be characterized in multiple ways which may be amenable to experimental measurement. For instance, in Figure 8*C*,*D*, we marked the center of mass for each of the curves, which shows a shift to the left for elevated-E/I, a shift to the right for lowered-E/I, and little change for upstream deficit. A feasible experimental design could test two pulse onset times in the pulse paradigm, one early and one late, and measure the ratio of early versus late magnitudes. The model predicts an increase in that ratio for elevated-E/I, a decrease for lowered-E/I, and little change for upstream deficit.
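Both shape measures used above are simple to compute from sampled time courses. The following sketch uses the standard definition of cosine-similarity and a weight-averaged time as the center of mass; the function names and the three example curves are hypothetical illustrations, not model output.

```python
import numpy as np

def cosine_similarity(w1, w2):
    """Normalized inner product of two sampled time courses."""
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    return float(np.dot(w1, w2) / (np.linalg.norm(w1) * np.linalg.norm(w2)))

def center_of_mass(t, w):
    """Weight-averaged time of a time course w(t)."""
    t = np.asarray(t, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(t * w) / np.sum(w))

# Hypothetical time courses on a 0-2 s grid (illustrative shapes only).
t = np.linspace(0.0, 2.0, 41)
control = t * np.exp(-t / 0.5)     # rise-then-decay profile
rescaled = 0.6 * control           # same shape, smaller magnitude
early = t * np.exp(-t / 0.25)      # weight shifted toward early times

similarity_rescaled = cosine_similarity(control, rescaled)
similarity_early = cosine_similarity(control, early)
shift_early = center_of_mass(t, early) - center_of_mass(t, control)
```

A pure rescaling leaves both the cosine-similarity (equal to 1) and the center of mass unchanged, whereas a curve reweighted toward early times lowers the similarity and moves the center of mass to the left.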

### Generalized DDM with imperfect integration

Thus far, we have characterized a spiking circuit model of the association cortex for decision-making, which can provide dissociable predictions for various disease-motivated deficits in E/I balance and sensory representations. Although biophysically-realistic models can provide insights into the underlying circuit mechanisms of behavioral alterations, algorithmic models defined at the level of psychological processes can provide complementary utility. Such models provide a more parsimonious and abstracted understanding, and are more readily fit directly to empirical data. In particular, the DDM is a highly influential and successful framework for modeling 2AFC decision-making (Ratcliff, 1978; Ratcliff and Rouder, 1998; Gold and Shadlen, 2007; Roxin and Ledberg, 2008; Farashahi et al., 2018).

The DDM describes the dynamics of a decision variable that accumulates noisy evidence over time (Fig. 9*A*). The decision variable undergoes a diffusion process biased by evidence-representing drift, until it reaches one of two predefined bounds above or below its starting point. An internal decision of one of the two choices is considered made when the decision variable crosses the corresponding bound. In the standard DDM, the temporal integration process is perfect, in the sense that the decision variable is the time integral of its evidence input, such that all previous evidence inputs will equally contribute to the decision variable before bound crossing. In other words, the memory timescale is infinite in the standard DDM. Past decision-making studies typically assume perfect integration in the DDM, and have focused on canonical parameters such as drift rate, noise strength, and bound heights (Ratcliff, 1978; Ratcliff and Rouder, 1998; Smith and Ratcliff, 2004; Gold and Shadlen, 2007).

Based on the differential E/I effects on the decision-making time course observed in the three paradigms, we hypothesized that in the DDM framework, E/I alterations may correspond to changes in the temporal integration process itself. To capture such effects, we generalized the DDM to include a self-coupling term λ (Fig. 9*A*; Bogacz et al., 2006; Roxin and Ledberg, 2008; Brunton et al., 2013; Miller and Katz, 2013; Farashahi et al., 2018; Shinn et al., 2020). λ = 0 corresponds to the perfect integration of the standard DDM, with an infinite time constant for memory (*B*). Here,

To interpret how imperfect integration shapes the time course of evidence accumulation, it is important to distinguish between two related but distinct processes with timescales: the integration time constant (*F*) which lengthens the time window in which evidence can impact the choice. Our selection of the self-coupling term λ as a key parameter to introduce into a generalized DDM and fit to capture effects of E/I alterations was motivated by the hypothesis of an altered integration process, as integration in the circuit model arises from excitatory-inhibitory recurrent synaptic interactions (Wang, 2002; Wong and Wang, 2006). Future computational modeling studies can test how alterations to various biophysical parameters in the spiking circuit can best be mapped onto parameters of generalized DDMs.
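To make the role of self-coupling concrete, a generalized DDM can be simulated trial by trial. The sketch below assumes the common form d*x* = (λ*x* + μ*c*)d*t* + σd*W* with symmetric absorbing bounds, integrated with the Euler–Maruyama method. The function name, the bound height of 1, the 2-s duration, the time step, and the example λ values of ±5 s^{−1} are assumptions made for illustration; the default μ and σ follow the control-circuit fit reported in the fitting section.

```python
import numpy as np

def simulate_generalized_ddm(lam, coherence, mu=14.3, sigma=1.33, bound=1.0,
                             duration=2.0, dt=0.001, n_trials=2000, seed=0):
    """Euler-Maruyama simulation of dx = (lam*x + mu*c) dt + sigma dW.

    Returns one entry per trial: +1 or -1 for the bound that was crossed,
    0 if neither bound was reached within the stimulus duration.
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n_trials)
    choice = np.zeros(n_trials, dtype=int)
    for _ in range(int(round(duration / dt))):
        active = choice == 0
        if not active.any():
            break
        noise = rng.standard_normal(active.sum())
        drift = lam * x[active] + mu * coherence
        x[active] += drift * dt + sigma * np.sqrt(dt) * noise
        # Commit trials whose decision variable has crossed a bound.
        choice[active & (x >= bound)] = 1
        choice[active & (x <= -bound)] = -1
    return choice

# Leaky (negative lambda) versus unstable (positive lambda) integration
# at zero coherence; the lambda values are hypothetical.
undecided_leaky = (simulate_generalized_ddm(-5.0, 0.0) == 0).mean()
undecided_unstable = (simulate_generalized_ddm(5.0, 0.0) == 0).mean()
```

In this sketch, negative λ pulls the decision variable back toward its starting point, so more trials end without a bound crossing, whereas positive λ accelerates it away from the starting point and toward early commitment.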

### Fitting generalized DDMs

While semi-analytical solutions exist for the standard DDM (i.e., with λ = 0) and allow efficient model fitting, the same typically does not hold for generalized DDMs with extensions such as self-coupling as considered here. Instead, generalized DDMs can be solved and fit using the Fokker–Planck equation, which describes the probability distribution of the decision variable as it evolves over time (Fig. 9*D*). The net flux across each bound is considered the probability committed to the corresponding choice. Practically, this requires discretizing the decision variable space and time into small steps, and propagating the Fokker–Planck equation over these steps. In the neuroscience literature, the forward-Euler method has commonly been used because of its intuitive and straightforward implementation. However, this method requires very small time steps for stability in the evolution of the probability density function (PDF), and can be prohibitively slow for fitting models to data. To overcome this technical challenge, we numerically solved the Fokker–Planck equation for the generalized DDM with the backward-Euler method, which has greater computational performance than forward-Euler, thereby enabling data-fitting of generalized DDMs (Shinn et al., 2020).
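The backward-Euler approach can be illustrated with a compact solver. This is a minimal sketch under stated assumptions, not the implementation used in the study: it assumes the form d*x* = (λ*x* + μ*c*)d*t* + σd*W*, a bound height of 1, central differences in space, and a dense matrix inverse computed once; the function name and grid sizes are hypothetical.

```python
import numpy as np

def solve_generalized_ddm_fp(lam, coherence, mu=14.3, sigma=1.33, bound=1.0,
                             duration=2.0, dt=0.005, dx=0.01):
    """Backward-Euler Fokker-Planck solution with absorbing bounds.

    Returns (P(upper choice), P(lower choice), P(undecided)).
    """
    n = int(round(2 * bound / dx)) - 1           # interior grid points
    x = np.linspace(-bound + dx, bound - dx, n)
    drift = lam * x + mu * coherence
    diff = 0.5 * sigma ** 2

    # Discrete Fokker-Planck operator acting on per-cell probability mass.
    L = np.zeros((n, n))
    idx = np.arange(n)
    L[idx, idx] = -2.0 * diff / dx ** 2
    L[idx[1:], idx[:-1]] = diff / dx ** 2 + drift[:-1] / (2 * dx)  # from left
    L[idx[:-1], idx[1:]] = diff / dx ** 2 - drift[1:] / (2 * dx)   # from right

    # Rates at which mass leaves the outermost cells through each bound.
    k_up = diff / dx ** 2 + drift[-1] / (2 * dx)
    k_low = diff / dx ** 2 - drift[0] / (2 * dx)

    # Implicit update: (I - dt*L) p_next = p_now. Inverted once for brevity.
    step = np.linalg.inv(np.eye(n) - dt * L)

    p = np.zeros(n)
    p[n // 2] = 1.0                               # all mass starts at x = 0
    p_up = p_low = 0.0
    for _ in range(int(round(duration / dt))):
        p = step @ p
        p_up += dt * k_up * p[-1]                 # flux across upper bound
        p_low += dt * k_low * p[0]                # flux across lower bound
    return p_up, p_low, p.sum()

p_up, p_low, p_undecided = solve_generalized_ddm_fp(lam=0.0, coherence=0.064)
```

Because the update is implicit, the time step here is not limited by the forward-Euler stability condition d*t* ≲ d*x*²/σ², which for this grid would require steps roughly two orders of magnitude smaller. For production fitting, a sparse tridiagonal solver would be the more natural choice than a dense inverse.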

To test the hypothesis that effects of E/I imbalance can be captured by changes in imperfect integration of a DDM, we fit generalized DDMs with self-coupling to the circuit models using neurally measured choice and indecision states in the standard task, and then examined the behavior of those fit DDMs on the nonstandard tasks of psychophysical kernel, pulse, and variable duration paradigms. As shown in the results above (Fig. 2), psychometric performance in the standard task does not effectively dissociate between model regimes. For this reason, we used the psychometric performance, combined with the neurally-defined indecision trial proportions, to fit the circuit models, as E/I perturbations affect both of these model features and dissociate elevated-E/I and lowered-E/I regimes.

We first fit a DDM without self-coupling (λ = 0) to the control circuit (Fig. 9*C*,*D*). As a standard DDM without self-coupling is the dominant DDM modeling framework in the field, this shows that performance of the control circuit can be approximated by the standard DDM. This fitting set the other two DDM parameters of drift rate μ = 14.3 s^{−1} and noise strength σ = 1.33 s^{−0.5}. We then fit the self-coupling parameter λ to each of the two disrupted E/I conditions. In line with our hypotheses, fitting λ to the elevated-E/I circuit yielded an unstable integrator with a large positive value (*C*,*E*). On the other hand, fitting λ to the lowered-E/I circuit yielded a leaky integrator with a large negative value (*C*,*F*). We note that with μ and σ fixed, the fit self-coupling term of the control circuit is small (λ_{control} = 0.0130). The fit self-coupling terms of the E/I-perturbed circuits correspond to relatively short integration time constants (

The above fitting analysis constrained the control circuit to be fit by a perfect integrator (λ = 0) to focus on deviations in self-coupling by the E/I perturbations. For validation, we simultaneously fit the three circuits together, without the constraint of λ = 0 for the control circuit. In this fitting procedure, each of the three circuits has its own independent λ and they share μ and σ. This fitting yielded self-coupling parameters ^{− 1/2}. This fitting procedure therefore yields the same major qualitative results as in the prior analysis, with unstable and leaky integration for the elevated-E/I and lowered-E/I circuits, respectively. The control circuit is fit by positive self-coupling which is much weaker than that of the elevated-E/I or lowered-E/I circuits, indicating a weakly unstable integrator, which is consistent with being offset from the sensitivity peak in Figure 2*C*. The corresponding integration timescale is longer for the control circuit (τ_{control} = 850 ms) than for either perturbed circuit (τ_{elevated} = 158 ms, τ_{lowered} = 217 ms).

Next, we tested these fitted generalized DDMs on the three paradigms that probe the time course of evidence accumulation (Fig. 10). We found that these two regimes of self-coupling could well capture key aspects of how E/I perturbations impact decision-making behavior across all paradigms. The unstable integrator, fit to the elevated-E/I circuit, over-emphasizes early evidence and discounts late evidence in both the psychophysical kernel and pulse paradigms. By contrast, for the leaky integrator, fit to the lowered-E/I circuit, those curves are flat and low, indicating less sensitivity to evidence in general. One notable difference from the spiking circuit results is that the generalized DDM is sensitive to the stimulus immediately following stimulus onset, unlike the circuit model which has an initial ramp-up in sensitivity (Wong and Wang, 2006; Roxin and Ledberg, 2008; Wimmer et al., 2015). This is because the circuit model takes time to reach an integrative state through its quasi-two-dimensional nonlinear dynamics (see above, Psychophysical kernel paradigm), whereas the one-dimensional generalized DDM can begin evidence accumulation from the beginning. In principle, this initial rise could be captured by a further generalization of the DDM, for instance to include a temporal aspect of gain modulation. However, we consider this feature to be relatively incidental and not predictive of the alterations of E/I balance.

Finally, we examined how a generalized DDM with self-coupling could be fit to behavioral choice data from a task paradigm that enables characterizing the time course of evidence accumulation. Specifically, we tested parameter recovery of the self-coupling term λ using only choice behavior in the pulse paradigm (Fig. 11). We varied λ in a generalized DDM, spanning from leaky (negative λ) to unstable (positive λ) integration regimes. For each value of λ, we simulated probabilistic choices in discrete sets of trials under different pulse onset times and coherences. We then fit all three of the generalized DDM parameters (λ, μ, σ) to the choice behavior for each circuit, using the Fokker–Planck method (see Materials and Methods). We found that λ could be recovered with high accuracy when tested with 20 pulse onset times as used in Figure 6 (SD = 0.41; Fig. 11*A*), as well as when tested using only three pulse onset times (SD = 1.51; Fig. 11*B*). These results demonstrate the feasibility of fitting generalized DDMs to choice behavior from appropriate task paradigms, to test model predictions for alterations in the temporal integration regime (leaky vs quasi-perfect vs unstable) of evidence accumulation in perceptual decision-making.

Overall, we found general agreement in decision-making behavior between the spiking circuits and the generalized DDMs with self-coupling fit to the circuit model. The parsimony of the generalized DDM, relative to the circuit model, is notable, as it is only a one-dimensional dynamical process regulated by few parameters. Parameter recovery of self-coupling in generalized DDMs, when fit to choice behavior in the pulse paradigm, suggests the potential utility of such task paradigms and fitting with generalized DDMs to give insight into decision-making deficits specific to experimental manipulations or neuropsychiatric disorders. Furthermore, the expanded flexibility of generalized DDMs allows them to translate insights from biophysically-based spiking circuit models across levels of analysis.

## Discussion

We examined alterations of E/I balance in a biophysically-based cortical circuit model of decision-making. Both elevating and lowering E/I ratio can degrade decision-making performance, following an inverted-U dependence. Elevated E/I ratio results in impulsive decision-making through under-utilization of later available evidence; lowered E/I ratio results in indecisive decision-making through weakened evidence accumulation to decision commitment. Several task paradigms, psychophysical kernel, pulse, and variable duration, dissociate these regimes by characterizing the time course of evidence accumulation. Generalized DDMs capture elevation and reduction of E/I ratio through more unstable and more leaky integration, respectively. Generalized DDMs provide theoretical insight and facilitate fitting models to behavioral data.

The model makes dissociable and testable predictions for decision-making behavior arising from distinct E/I perturbations. In animal models, optogenetic and chemogenetic methods enable manipulation of E/I ratio with regional and cell-type specificity (Yizhar et al., 2011; Markicevic et al., 2020), which could test model predictions for neural activity and behavior in fixed-duration or response-time paradigms. In humans, perceptual decision-making behavior can be measured during pharmacological perturbations of cortical E/I ratio. For instance, Carter et al. (2004) found that psilocybin impairs perceptual discrimination with random-dot motion stimuli. NMDAR antagonists, such as ketamine, are of interest because NMDAR hypofunction is implicated in the neuropathology and symptomatology of schizophrenia (Krystal et al., 1994; Lahti, 1995). Subanesthetic ketamine administration impacts various cognitive functions in humans and monkeys (Skoblenick and Everling, 2012; Blackman et al., 2013; Ma et al., 2015).

Neuromodulators, including dopamine and norepinephrine, could regulate decision-making processes through modulation of cortical E/I ratio (Aston-Jones and Cohen, 2005; McGinley et al., 2015; Pfeffer et al., 2018), to adjust cognitive behavior according to task demands (Ueltzhöffer et al., 2015; Urai et al., 2017). Pupil dilation can reflect noradrenergic modulation of arousal (Murphy et al., 2014; Reimer et al., 2016). In line with our model predictions for elevated E/I ratio, Keung et al. (2018) found that larger pupil dilation corresponded with greater weighting of early evidence in psychophysical kernels. Prior computational modeling studies have linked E/I effects of neuromodulation to cognitive deficits (Durstewitz and Seamans, 2008; Eckhoff et al., 2009; Cano-Colino et al., 2014). Our study focuses on how E/I regulation by neuromodulators may impact evidence accumulation dynamics, providing testable predictions for behavioral and neural experiments.

Random-dot motion discrimination tasks, using a standard paradigm, have revealed impaired perceptual discrimination (i.e., higher discrimination threshold) in schizophrenia (Chen et al., 2003, 2004, 2005) and autism spectrum disorder (Milne et al., 2002; Koldewyn et al., 2010), both of which are associated with disrupted cortical E/I balance (Lisman et al., 2008; Marin, 2012; Nakazawa et al., 2012; Lee et al., 2016). Prior studies primarily interpreted those impairments as reflecting dysfunction in early visual cortex, such as area MT (Butler et al., 2008). Our model suggests a potential complementary impairment in downstream association cortical decision-making circuits. Task paradigms which can dissociate different forms of impairments have potential to reveal concurrent alterations in sensory and cognitive computations in neuropsychiatric disorders. The model's neurophysiological basis provides interpretation of dissociable behavioral regimes in terms of hypothesized deficits in cortical E/I balance.

We focused on three fixed-duration task paradigms which probe the time course of evidence accumulation. Our findings show that even when the stimulus input filter is constant in time, the stimulus's impact on choice can follow a time-varying profile that is shaped by the internal dynamics of decision formation in the circuit (Wimmer et al., 2015), and is thereby shaped by E/I ratio. In line with these findings, recent studies show that observed temporal weights, in tasks similar to our psychophysical kernel paradigm, do not in general reflect only the stimulus's “sensory filter,” but instead reflect both sensory and decision-making processes (Kiani et al., 2008; Yates et al., 2017; Okazawa et al., 2018; Stine et al., 2020). Although the fixed-duration task paradigms considered here might not be best-suited to characterize a nonconstant sensory filter, they are useful to probe decision-making processes across experimental conditions (Sheppard et al., 2013; Scott et al., 2015; Kawaguchi et al., 2018), especially in the context of perturbations (Erlich et al., 2015; Katz et al., 2016).

Depending on task demands, it may be functionally beneficial to modulate the integration timescale of decision-making to match stimulus temporal statistics. For example, older information might become outdated and can be discounted via leaky integration. Recent studies show how flexible adaptation can be implemented to optimize decision-making performance in dynamical environments (Ossmy et al., 2013; Glaze et al., 2015; Veliz-Cuba et al., 2016; Farashahi et al., 2018; Levi et al., 2018). Our results suggest that impairment in the dynamical range of modulating E/I ratio could lead to a cognitive deficit of reduced flexibility in tuning integration timescales.
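As an illustrative sketch of how leaky integration discounts older information, consider a linear, unbounded integrator with self-coupling λ, momentary evidence s(t), and noise amplitude σ. These symbols and the absence of decision bounds are assumptions introduced here for illustration, not the model equations of this study.

```latex
\begin{aligned}
dx &= \bigl(\lambda x + s(t)\bigr)\,dt + \sigma\,dW, \qquad x(0) = 0,\\
x(T) &= \int_0^T e^{\lambda (T-t)}\, s(t)\,dt \;+\; \sigma \int_0^T e^{\lambda (T-t)}\,dW_t .
\end{aligned}
```

Under these assumptions, evidence presented at time t contributes to the final decision variable with weight e^{λ(T−t)}. For λ < 0 (leaky), evidence older than roughly 1/|λ| is discounted; for λ > 0 (unstable), early evidence is amplified relative to late evidence; and λ = 0 recovers uniform weighting. The integration timescale 1/|λ| is thus the quantity that would need to be tuned to match stimulus temporal statistics.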

Our study demonstrates feasibility and utility of relating classes of computational models that operate at different levels of abstraction, as here by linking biophysically-based spiking circuits and generalized DDMs with imperfect integration. Incorporating a self-coupling term into generalized DDMs can capture behavioral effects of imperfect integration (Bogacz et al., 2006; Roxin and Ledberg, 2008; Miller and Katz, 2013; Farashahi et al., 2018). Biophysically-based circuit models are highly computationally intensive, which limits their practical application to fit empirical behavioral data in cognitive tasks. In contrast, the generalized DDM is computationally tractable when simulated using the Fokker–Planck formalism, which enables efficient fitting to empirical and model-generated psychometric choice data (Shinn et al., 2020). A potentially fruitful strategy may be to fit generalized DDMs to empirical psychophysical data, using sensitive task paradigms, as well as to circuit models to link to neurophysiological hypotheses.
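To make the role of the self-coupling term concrete, the following minimal sketch simulates a one-dimensional process dx = (λx + μ) dt + σ dW and reads out the choice from the sign of x at the end of a fixed-duration trial. It is a simplified illustration under stated assumptions, not the fitting procedure described above: it uses Monte Carlo simulation with Euler-Maruyama steps rather than the Fokker–Planck formalism, it omits decision bounds, and the function names (`simulate_choice`, `accuracy`) and parameter values are hypothetical.

```python
import math
import random


def simulate_choice(lam, mu, sigma=1.0, T=2.0, dt=0.01, rng=None):
    """Simulate one trial of dx = (lam*x + mu) dt + sigma dW with x(0) = 0.

    lam > 0 gives unstable integration, lam < 0 gives leaky integration,
    and lam = 0 gives perfect integration. No decision bounds are used
    (an assumption of this sketch); the choice is the sign of x at time T.
    Returns +1 or -1.
    """
    rng = rng or random.Random()
    x = 0.0
    n_steps = int(round(T / dt))
    sqrt_dt = math.sqrt(dt)
    for _ in range(n_steps):
        # Euler-Maruyama update: deterministic drift plus Gaussian noise.
        x += (lam * x + mu) * dt + sigma * sqrt_dt * rng.gauss(0.0, 1.0)
    return 1 if x >= 0 else -1


def accuracy(lam, mu, n_trials=2000, seed=0):
    """Fraction of trials whose choice matches the sign of a positive drift mu."""
    rng = random.Random(seed)
    n_correct = sum(simulate_choice(lam, mu, rng=rng) == 1 for _ in range(n_trials))
    return n_correct / n_trials


if __name__ == "__main__":
    # Compare leaky, perfect, and unstable integration at the same drift.
    for lam in (-2.0, 0.0, 2.0):
        print(f"lambda = {lam:+.1f}: accuracy = {accuracy(lam, mu=1.0):.3f}")
```

In this unbounded linear setting, both positive and negative self-coupling reduce the signal-to-noise ratio of the final decision variable relative to perfect integration, so accuracy is expected to peak near λ = 0. This is only a qualitative analogue of the inverted-U dependence found in the circuit model.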

An underlying hypothesis arising from our model architecture is that on trials when the decision circuit fails to reach a categorical choice state, a downstream motor circuit produces a random behavioral response. This architecture, following prior modeling (Wang, 2002), makes neurophysiologically testable predictions. The separation between decision and motor circuits in our model is analogous to distinct roles of neurophysiological cell types observed in the frontal eye field (FEF) during perceptual decision-making tasks, specifically the correspondence of the decision circuit in our model with FEF visual neurons and of the downstream motor response in our model with FEF motor neurons (Schall, 2015). In typical decision-making tasks, FEF visual neurons produce categorical decision-related activity, seen as a separation of firing rates for selected versus nonselected targets, after which FEF motor neurons ramp to trigger an associated saccadic response (Woodman et al., 2008; Purcell et al., 2010). In tasks with timing pressure and limited sensory evidence, motor neurons can trigger a response without decision-related selection in visual neurons (Stanford et al., 2010; Costello et al., 2013). This decision-to-motor architecture has been instantiated in prior circuit models separating decision and motor sub-circuits (Soltani et al., 2013; Murray et al., 2017a). Random response generation on indecision trials can be implemented through nonselective excitatory drive to the downstream motor circuit at the offset of stimulus presentation, carrying an urgency-related signal (Wong and Wang, 2006). Future experimental and computational studies can further examine these model hypotheses for decision-related and motor-related activity under E/I manipulations.
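The behavioral consequence of random responding on indecision trials can be written as a simple mixture for a two-alternative task. The symbols below are introduced here for illustration and are not defined in the study: p_commit is the probability that the decision circuit reaches a categorical choice state, and a_commit is the accuracy on those committed trials.

```latex
P(\text{correct}) \;=\; p_{\text{commit}}\; a_{\text{commit}} \;+\; \bigl(1 - p_{\text{commit}}\bigr)\cdot \tfrac{1}{2}
```

Under this sketch, a growing fraction of indecision trials pulls overall accuracy toward chance (1/2) even if accuracy on committed trials is unchanged.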

The circuit model in this study produces decision-making behavior with integration of evidence through ramping neural dynamics (Wang, 2002; Wong and Wang, 2006). Other dynamical regimes may be used by different modes of decision-making. For instance, decision-making may be implemented, in the same cortical circuit architecture as our model, through stochastic “jumping” between multiple quasi-stable attractor states (Martí et al., 2008; Miller and Katz, 2010, 2013). A related dynamical regime is one in which large fluctuations drive frequent transitions between decision-related attractor states (Albantakis and Deco, 2011; Prat-Ortega et al., 2021). Recent experiments by Najafi et al. (2020) demonstrated choice-tuned activity in inhibitory interneurons, which may support continuous attractor dynamics as another dynamical regime for decision circuits (Lim and Goldman, 2013). Inagaki et al. (2019) found that decision-making activity in frontal cortex resembles discrete attractor dynamics rather than continuous attractor dynamics. It remains to be studied how predictions from the present model framework generalize to other dynamical regimes.

The model predictions are based on a circuit model that is biophysically grounded yet parsimonious (Wang, 2002). Multiple model extensions are possible to investigate specific neurophysiological hypotheses. Effects of neuromodulators can be studied in these circuit models (Durstewitz et al., 2000; Eckhoff et al., 2009; Cano-Colino et al., 2014). Another future modeling direction is to examine these phenomena in more distributed circuit models, e.g., with multiple brain regions and cortical layers (Soltani et al., 2013; Mejias et al., 2016; Murray et al., 2017b). Finally, the current study considers recurrent interactions with only two cell types. Distinct classes of inhibitory interneurons exhibit diverse cellular and synaptic properties, microcircuit connectivity motifs, and neurophysiological responses (Rudy et al., 2011; Kepecs and Fishell, 2014; Jiang et al., 2015). Incorporating distinct inhibitory interneuron classes into circuit models could allow finer-grained investigation of inhibitory dysfunction beyond the relatively coarse net effect of E/I ratio (Wang et al., 2004; Yang et al., 2016; O'Donnell et al., 2017).

## Footnotes

This work was supported by National Institutes of Health Grants R01MH112746 (to J.D.M.), TL1TR000141 (to J.D.M.), DP5OD012109 (to A.A.), and R01MH062349 (to X.-J.W.); Simons Foundation Autism Research Initiative (SFARI) Pilot Award (J.D.M., A.A.); and the Brazilian National Council for Scientific and Technological Development (CNPq) Science Without Borders Grant 202853/2014 (to T.B.). We thank Qinglong Gu for help with model fitting.

The authors declare no competing financial interests.

- Correspondence should be addressed to John D. Murray at john.murray{at}yale.edu