## Abstract

Perceptual decision-making is the subject of many experimental and theoretical studies. Most modeling analyses are based on statistical processes of accumulation of evidence. In contrast, very few works confront attractor network models' predictions with empirical data from continuous sequences of trials. Recently however, numerical simulations of a biophysical competitive attractor network model have shown that such a network can describe sequences of decision trials and reproduce repetition biases observed in perceptual decision experiments. Here we get more insights into such effects by considering an extension of the reduced attractor network model of Wong and Wang (2006), taking into account an inhibitory current delivered to the network once a decision has been made. We make explicit the conditions on this inhibitory input for which the network can perform a succession of trials, without being either trapped in the first reached attractor, or losing all memory of the past dynamics. We study in detail how, during a sequence of decision trials, reaction times and performance depend on nonlinear dynamics of the network, and we confront the model behavior with empirical findings on sequential effects. Here we show that, quite remarkably, the network exhibits, qualitatively and with the correct order of magnitude, post-error slowing and post-error improvement in accuracy, two subtle effects reported in behavioral experiments in the absence of any feedback about the correctness of the decision. Our work thus provides evidence that such effects result from intrinsic properties of the nonlinear neural dynamics.

**SIGNIFICANCE STATEMENT** Much experimental and theoretical work is being devoted to the understanding of the neural correlates of perceptual decision-making. In a typical behavioral experiment, animals or humans perform a continuous series of binary discrimination tasks. To model such experiments, we consider a biophysical decision-making attractor neural network, taking into account an inhibitory current delivered to the network once a decision is made. Here we provide evidence that the same intrinsic properties of the nonlinear network dynamics underpin various sequential effects reported in experiments. Quite remarkably, in the absence of feedback on the correctness of the decisions, the network exhibits post-error slowing (longer reaction times after error trials) and post-error improvement in accuracy (smaller error rates after error trials).

## Introduction

Typical experiments on perceptual decision-making consist of series of successive trials separated by a short time interval, in which performance in identification and reaction times are measured. The most studied protocol is the two-alternative forced-choice (TAFC) task (Ratcliff, 1978; Laming, 1979a; Vickers, 1979; Townsend and Ashby, 1983; Busemeyer and Townsend, 1993; Shadlen and Newsome, 1996; Usher and McClelland, 2001; Ratcliff and Smith, 2004). Several studies have demonstrated strong serial dependence in perceptual decisions between temporally close stimuli (Fecteau and Munoz, 2003; Jentzsch and Dudschig, 2009; Danielmeier and Ullsperger, 2011). Such effects have been studied in the framework of statistical models of accumulation of evidence (Dutilh et al., 2012), the most common theoretical approach to perceptual decision-making (Ratcliff, 1978; Ashby, 1983; Bogacz et al., 2006; Shadlen et al., 2006; Ratcliff and McKoon, 2008) or with a more complex attractor network with additional memory units specifically implementing a biasing mechanism (Gao et al., 2009).

Wang (2002) proposed an alternative approach to the modeling of perceptual decision-making based on a biophysical cortical network model of leaky integrate-and-fire neurons. The model is shown to account for random-dot experimental results from Shadlen and Newsome (2001) and Roitman and Shadlen (2002). This decision-making attractor network was first studied in the context of a task that requires keeping the last decision in memory. This working memory effect is precisely achieved by having the network activity trapped into an attractor state. However, in the context of consecutive trials, the neural activities have to be reset in a low activity state before the onset of the next stimulus. Bonaiuto et al. (2016) have considered a parameter range of weaker excitation where the working memory phase cannot be maintained. The main result is that the performance of the network is biased toward the previous decision, an effect which decreases with the intertrial time. Because of the slow relaxation dynamics in the model, these authors only study intertrial times >1.5 s. However, sequential effects have been reported for shorter intertrial times, such as 500 ms by Laming (1979a) and Danielmeier and Ullsperger (2011). Instead of decreasing the recurrent excitation, an alternative is to introduce an additional inhibitory input following a decision (Lo and Wang, 2006; Engel et al., 2015; Bliss and D'Esposito, 2017). Lo and Wang (2006) have proposed such a mechanism to account for the control of the decision threshold.

The purpose of the present paper is to revisit this issue of dealing with sequences of successive trials within the framework of attractor networks with a focus on intertrial times as short as 500 ms. We do so by taking advantage of the reduced model of Wong and Wang (2006) which is amenable to mathematical analysis. This model consists of a network of two units, representing the pool activities of two populations of cells, each one being specific to one of the two stimulus categories. Wong and Wang (2006) derive the equations of the reduced model and choose the parameter values to preserve as much as possible the dynamical and behavioral properties of the original model. In line with Lo and Wang (2006), we take into account an inhibitory current originating from the basal ganglia, occurring once a decision has been made. We explore how the network nonlinear dynamics leads to serial dependence effects in TAFC tasks, and compare with empirical findings such as sequential bias in decisions (Cho et al., 2002) or post-error adjustments (Danielmeier and Ullsperger, 2011; Danielmeier et al., 2011). Our main finding is that the model reproduces two main post-error adjustments observed in the absence of feedback on the correctness of the decision: post-error slowing (PES) and post-error improvement in accuracy (PIA), with PES consisting of longer reaction times, and PIA of smaller error rates, for trials following a trial with an incorrect decision. We thus provide evidence that such effects result from nonlinearities in the neural dynamics.

## Materials and Methods

We are interested in modeling experiments where a subject has to decide whether a stimulus belongs to one or the other of two categories, hereafter denoted *L* and *R*. A particular example is the one of random-dot experiments (Shadlen and Newsome, 2001; Roitman and Shadlen, 2002), where a monkey performs a motion discrimination task in which it has to decide whether a motion direction, embedded into a random-dot motion, is toward left (*L*) or right (*R*). The general case is the one of categorical perception experiments, in which one can control the degree of ambiguity of the stimuli; e.g., psycholinguistic experiments with stimuli interpolating between two phonemes (Liberman et al., 1957), visual categorization experiments with continuous morphs from cats to dogs (Freedman et al., 2003), etc. We focus on TAFC protocols in which no feedback is given on the correctness of the decisions.

We consider a decision-making recurrent network of spiking neurons governed by local excitation and feedback inhibition, as introduced and studied by Compte et al. (2000) and Wang (2002). Because mathematical analysis of such complex networks is difficult without a high level of abstraction (Miller and Katz, 2013), one must rely on simulations which, themselves, can be computationally heavy. For our analysis, we make use of the reduced firing-rate model of Wong and Wang (2006) obtained by a systematic reduction of the detailed biophysical attractor network model. The reduction aimed at faithfully reproducing not only the behavioral performance of the full model, but also neural firing rate dynamics and the output synaptic gating variables. This is done within a mean-field approach, with calibrated simplified *F*/*I* curves for the neural units, and in the limit of slow NMDA gating variables motivated by neurophysiological data. The full details were given by Wong and Wang (2006; their main text and supplemental information).

Because this model has been built to reproduce as faithfully as possible the neural activity of the full spiking neural network, it can be used as a proxy for simulating the full spiking network (Engel and Wang, 2011; Deco et al., 2013; Engel et al., 2015). Here, we mainly make use of this model to gain better insights into the understanding of the model behavior. In particular, one can conveniently represent the network dynamics in a 2D phase plane and rigorously analyze the network dynamics (Wong and Wang, 2006).

#### A reduced recurrent network model for decision-making

We first present the architecture without the corollary discharge (Wong and Wang, 2006; Fig. 1*A*), which consists of two competing units, each one representing an excitatory neuronal pool, selective to one of the two categories, *L* or *R*. The two units inhibit one another, while they are subject to self-excitation. The dynamics is described by a set of coupled equations for the synaptic activities *S*_{L} and *S*_{R} of the two units *L* and *R*:

$$\frac{dS_i}{dt} = -\frac{S_i}{\tau_S} + (1 - S_i)\,\gamma\, f\!\left(I_{i,\mathrm{tot}}\right), \qquad i \in \{L, R\}. \tag{1}$$

The synaptic drive *S*_{i} for pool *i* ∈ {*L*, *R*} corresponds to the fraction of activated NMDA conductance, and *I*_{i,tot} is the total synaptic input current to unit *i*; τ_{S} and γ are kinetic parameters of the NMDA gating variable. The function *f* is the effective single-cell input/output relation (Abbott and Chance, 2005), giving the firing rate as a function of the input current:

$$f(I) = \frac{aI - b}{1 - \exp\left[-d\,(aI - b)\right]}, \tag{2}$$

where *a*, *b*, *d* are parameters whose values are obtained through numerical fit.
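As a concrete illustration of the synaptic dynamics (Eq. 1) and of the input/output relation *f* (Eq. 2), the following Python sketch evaluates the firing rate and the deterministic drift of one synaptic variable. The numerical values are assumptions, taken from the commonly quoted parameter set of Wong and Wang (2006); the values of Table 1 take precedence. The function names are illustrative only.

```python
import math

# Assumed values (commonly quoted from Wong and Wang, 2006); Table 1 takes precedence.
A = 270.0      # Hz/nA, gain of the input/output relation
B = 108.0      # Hz
D = 0.154      # s
GAMMA = 0.641  # dimensionless kinetic parameter
TAU_S = 0.100  # s, NMDA gating time constant


def firing_rate(current_nA):
    """f(I) = (a I - b) / (1 - exp(-d (a I - b))), in Hz."""
    x = A * current_nA - B
    if abs(x) < 1e-9:
        # Removable singularity at a I = b: the limit of f is 1/d.
        return 1.0 / D
    return x / (1.0 - math.exp(-D * x))


def synaptic_drift(s, current_nA):
    """Right-hand side of dS/dt = -S/tau_S + (1 - S) * gamma * f(I)."""
    return -s / TAU_S + (1.0 - s) * GAMMA * firing_rate(current_nA)
```

With these assumed values, `firing_rate` returns a rate of a few hertz for input currents near the background level, and `synaptic_drift(0.1, I)` is close to zero for the corresponding total current, which is what one expects near the low-activity neutral state.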

The total synaptic input currents, taking into account the inhibition between populations, the self-excitation, the background current and the stimulus-selective current can be written as follows:

$$I_{L,\mathrm{tot}} = J_{L,L}\, S_L - J_{L,R}\, S_R + I_{\mathrm{stim},L} + I_{\mathrm{noise},L}, \tag{3}$$

$$I_{R,\mathrm{tot}} = J_{R,R}\, S_R - J_{R,L}\, S_L + I_{\mathrm{stim},R} + I_{\mathrm{noise},R}, \tag{4}$$

with *J*_{i,j} the synaptic couplings (*i* and *j* being *L* or *R*). The minus signs in the equations make explicit the fact that the inter-unit connections are inhibitory (the synaptic parameters *J*_{i,j} being thus positive or null). The term *I*_{stim,i} is the stimulus-selective external input. If μ_{0} denotes the strength of the signal, the form of this stimulus-selective current is as follows:

$$I_{\mathrm{stim},i} = J_{\mathrm{ext}}\, \mu_0\, (1 \pm c), \tag{5}$$

where *J*_{ext} is the synaptic coupling of the external input. The sign, ±, is positive when the stimulus favors population *L*, negative in the other case. The quantity *c*, between 0 and 1, gives the strength of the signal bias. It quantifies the coherence level of the stimulus. For example, in the random-dot motion framework, it corresponds to the fraction of dots contributing to the coherent motion. In the following, we will give this coherence level in percentage. Following Wang (2002), this input forms the pooling of the activities of middle temporal neurons firing according to their preferred directions. This input current is only present during the presentation of the stimulus and is shut down once the decision is made.

In the present model, in line with a large literature modeling decision-making, the input, Equation 5, is thus reduced to a signal parametrized by a scalar quantifying the coherence or degree of ambiguity of the stimulus. More global approaches consider the explicit coupling between the encoding and the decision neural populations, with a population of stimulus-specific cells for the coding layer (Beck et al., 2008; Bonnasse-Gahot and Nadal, 2012; Engel et al., 2015). We believe that the main results presented here would not be affected by extending the model to take into account the coding stage, but we leave such study for further work.

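In addition to the stimulus-selective part, each unit receives individually an extra noisy input, fluctuating around the mean effective external input *I*_{0}:

$$\tau_{\mathrm{noise}}\, \frac{dI_{\mathrm{noise},i}}{dt} = -\left(I_{\mathrm{noise},i}(t) - I_0\right) + \eta_i(t)\, \sqrt{\tau_{\mathrm{noise}}}\; \sigma_{\mathrm{noise}}, \tag{6}$$

with τ_{noise}, a synaptic time constant that filters the (uncorrelated) white noises, η_{i}(*t*), *i* = *L*, *R*, and σ_{noise} the noise amplitude. For the simulations, unless otherwise stated, parameter values will be those shown in Table 1.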
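The noisy input described above is a filtered white noise (an Ornstein–Uhlenbeck process) fluctuating around *I*_{0}. A minimal Python sketch of one Euler–Maruyama update is given below, assuming the form τ_{noise} d*I*/d*t* = −(*I* − *I*_{0}) + η(*t*) √τ_{noise} σ_{noise}. The numerical values (and the name σ_{noise} for the noise amplitude) are assumptions based on the commonly quoted Wong and Wang (2006) parameters; Table 1 takes precedence.

```python
import math
import random

# Assumed values; Table 1 of the paper takes precedence.
I_0 = 0.3255        # nA, mean effective external input
SIGMA_NOISE = 0.02  # nA, noise amplitude
TAU_NOISE = 0.002   # s, synaptic filtering time constant
DT = 0.0005         # s, integration step (0.5 ms, as in the simulations)


def noise_step(i_noise, rng):
    """One Euler-Maruyama step of
    tau dI/dt = -(I - I_0) + eta(t) * sqrt(tau) * sigma."""
    drift = -(i_noise - I_0) * DT / TAU_NOISE
    diffusion = SIGMA_NOISE * math.sqrt(DT / TAU_NOISE) * rng.gauss(0.0, 1.0)
    return i_noise + drift + diffusion


def simulate_noise(n_steps, seed=0):
    """Return a trajectory of the noisy current, starting from its mean."""
    rng = random.Random(seed)
    trace = [I_0]
    for _ in range(n_steps):
        trace.append(noise_step(trace[-1], rng))
    return trace
```

In the continuous-time limit the stationary standard deviation of this process is σ_{noise}/√2; with the step size assumed here (a quarter of τ_{noise}) the discrete scheme gives a slightly larger spread.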

Initially the system is in a symmetric (or neutral) attractor state, with low firing rates and synaptic activities (Fig. 1*B*). On the presentation of the stimulus, the system evolves toward one of the two attractor states, corresponding to the decision state. In these attractors, the “winning” unit fires at a higher rate than the other. We are interested in reaction time experiments. In our simulations, we consider that the system has made a decision when for the first time the firing rate of one of the two units crosses a threshold θ, fixed here at 20 Hz. We have chosen this parameter value, slightly different from the one by Wong and Wang (2006), from the calibration of the extended model discussed below on sequential decision trials with short response–stimulus intervals (RSIs). We have checked that this does not affect the psychometric function of the network: the accuracy is unchanged and the reaction time is shifted by a constant.

#### Extended reduced model: inhibitory corollary discharge

Studies (Roitman and Shadlen, 2002; Ganguli et al., 2008) show that, during decision tasks, neurons' activity experiences a rapid decay following the responses (Roitman and Shadlen, 2002, their Figs. 7 and 9). Simulations of the above model show that even when the stimulus is withdrawn at the time of decision, the decrease in activity is not sufficiently strong to account for these empirical findings. Decreasing the recurrent excitatory weights does allow for a stronger decrease in activity, as shown by Bonaiuto et al. (2016). However, both the increase and the decay of activities are too slow, and the network cannot perform sequential decisions with RSIs <1 s. Hence the decrease in activity requires an inhibitory input at the time of the decision.

Such an inhibitory mechanism has been proposed to originate from the superior colliculus (SC), controlling saccadic eye movements, and the basal ganglia-thalamic circuit, which plays a fundamental role in many cognitive functions including perceptual decision-making. Indeed, the burst neurons of the SC receive inputs from the parietal cortex and project to midbrain neurons responsible for the generation of saccadic eye movements (Scudder et al., 2002; Hall and Moschovakis, 2003). Thus the threshold crossing of the cortical neural activity is believed to be detected by the SC (Saito and Isa, 2003). In turn, the SC projects feedback connections on cortical neurons (Crapse and Sommer, 2009). At the time of a saccade, SC neurons emit a corollary discharge (CD) through these feedback connections (Sommer and Wurtz, 2008). The impact of this CD as an inhibition has been discussed in various contexts (Crapse and Sommer, 2008; Sommer and Wurtz, 2008; Yang et al., 2008). The generation of a corollary discharge resulting in an inhibitory input has been proposed and discussed in several modeling works, in the case of the modulation of the decision threshold in reaction time tasks (Lo and Wang, 2006), in the context of learning (Engel et al., 2015), and in a ring model of visual working memory (Bliss and D'Esposito, 2017).

We note here that, for simplicity and in accordance with the existing literature (Lo and Wang, 2006; Engel et al., 2015; Bliss and D'Esposito, 2017), we will be referring to the inhibitory current resulting from the corollary discharge as the *corollary discharge*.

In the context of attractor networks for decision tasks, Lo and Wang (2006) introduce an extension of the biophysical model of Wang (2002) that consists of modeling the coupling between the network, the basal ganglia, and the SC. The net effect is an inhibition onto the populations in charge of making the decision. Although Lo and Wang (2006) address the issue of the control of the decision threshold, they do not discuss the relaxation dynamics induced by the CD, nor the effects on sequential decision tasks outside a learning context (Hsiao and Lo, 2013).

To analyze these effects with the reduced attractor network model, we assume that, after crossing the threshold, the network receives an inhibitory current, mimicking the joint effect of basal ganglia and SC on the two neural populations (Fig. 2*A*).

In the case of Engel et al. (2015), the function of the CD is to reset the neural activity to allow the network to learn during the next trial. For this, the form of the CD input is chosen as a constant inhibitory current for a duration of 300 ms. However, such a strong input leads to an abrupt reset to the neutral state with no memory of the previous trial. We thus rather consider here a smooth version of this discharge, considering that the resulting inhibitory input has a standard exponential form (Finkel and Redman, 1983). The inhibitory input, *I*_{CD}(*t*), is then given by the following:

$$I_{CD}(t) = \begin{cases} 0 & \text{during stimulus presentation}, \\ -I_{CD,\max}\, \exp\!\left(-\dfrac{t - t_D}{\tau_{CD}}\right) & \text{after the decision time } t_D. \end{cases} \tag{7}$$

The relaxation time constant τ_{CD} is chosen in the biological range of synaptic relaxation times and in accordance with the relaxation-time range of the network dynamics, τ_{CD} = 200 ms (Fig. 2*B*; see Discussion).

Therefore the input currents are modified as follows:

$$I_{L,\mathrm{tot}} = J_{L,L}\, S_L - J_{L,R}\, S_R + I_{\mathrm{stim},L} + I_{\mathrm{noise},L} + I_{CD}(t), \tag{8}$$

$$I_{R,\mathrm{tot}} = J_{R,R}\, S_R - J_{R,L}\, S_L + I_{\mathrm{stim},R} + I_{\mathrm{noise},R} + I_{CD}(t). \tag{9}$$

We can now study the dynamics of this system in a sequence of decision trials (protocol illustrated in Fig. 2*C*). We address here two issues: first, is there a parameter regime for which the network can engage in a series of trials; that is, for which the state of the dynamical system, at the end of the relaxation period (end of the RSI), is close to the neutral state (instead of being trapped in the attractor reached at the first trial); second, is there a domain within this parameter regime for which one expects to see sequential effects (instead of a complete loss of the memory of the previous decision state).

In Figure 3 we illustrate the network dynamics between two consecutive stimuli during a sequence of trials, comparing the cases with and without the CD. In the absence of the CD input, the network is not able to make a new decision different from the previous one (Fig. 3*A*). Even when the opposite stimulus is presented, the system cannot leave the attractor previously reached, except in the presence of an unrealistically strong input bias. If however the strength *I*_{CD,max} is large enough, the CD makes the system escape from the previous attractor and relax toward the vicinity of the neutral resting state with low firing rates. If the CD is too strong, or the RSI too long, the neutral state has been reached at the onset of the next stimulus and memory of past trials is lost. For an intermediate range of parameters, at the onset of the next stimulus the system has escaped from the attractor but is still on a trajectory dependent on the previous trial (Fig. 3*B*).

We have computed the time constant τ of the network during relaxation (during the RSI), with respect to the CD amplitude, *I*_{CD,max} (Fig. 2*B*). This computation is done for a CD with a constant amplitude, *I*_{CD}(*t*) = *I*_{CD,max}. One sees that, for *I*_{CD,max} of order 0.03–0.04 nA, the network time constant τ is four to five times smaller than the duration of the RSI. We choose the relaxation constant τ_{CD} of the CD of the same order of magnitude (as in the above simulation where τ_{CD} = 200 ms). With such a value, at the onset of the next stimulus, the network state will still be far enough from the symmetric attractor, so that we can expect to observe sequential effects, as confirmed by the analysis in Results.

With the inhibitory CD, after the threshold is crossed by one of the two neural populations, there is a big drop in the neuronal activity (Fig. 3*B*), corresponding to the exit from the previous attractor state. This type of time course is in agreement with the experimental findings of Roitman and Shadlen (2002) and Ganguli et al. (2008), who measure the activity of LIP neurons during a decision task. They show that neurons that accumulate evidence during decision tasks experience rapid decay, or inhibitory suppression, of activity following responses, similar to Figure 3*B* (but see Lo and Wang, 2006 for a related modeling study with spiking neurons, or Gao et al., 2009 for rapid decay of neural activity with another type of attractor network).

We now derive the conditions on *I*_{CD} under which the network is able to make a sequence of trials. To this end, we analyze the dynamics after a decision has been made, during the RSI (hence during the period with no external excitatory inputs). The results are illustrated in Figure 4 on which we represent a sketch of the phase plane dynamics and a bifurcation diagram.

Consider first what would happen under a scenario of a constant, time-independent, inhibitory input during all the RSI (Fig. 4*A–D*; formally, this corresponds to setting τ_{CD} = + ∞ in Eq. 7). At small values of the inhibitory current, the attractor landscape is qualitatively the same as in the absence of inhibitory current: in the absence of noise there are three fixed points, one associated with each category and the neutral one (Fig. 3*B*). At some critical value, of ∼0.0215 nA, there is a bifurcation (Fig. 4*D*); for larger values of the inhibitory current, only one fixed point remains, the neutral one (Fig. 4*D*). As a result, applying a constant CD would either have no effect on the attractor landscape (current amplitude below the critical value), so that the dynamics remains within the basin of attraction of the attractor reached at the previous trial; or would reset the activity at the neutral state (current amplitude above the critical value), losing all memory of the previous decision.

Now in the case of a CD with a value decreasing with time (Fig. 4*E–H*, scenario of an exponential decay), the network behavior will depend on where the dynamics lies at the time of the onset of the next stimulus. The dynamics, starting from a decision state (Fig. 4*F*,*G*, near the blue attractor), is more easily understood by considering the limit of slow relaxation (large time constant τ_{CD}). Between times *t* and *t* + Δ*t*, with Δ*t* small compared with τ_{CD}, the dynamics is similar to what it would be with a constant CD with amplitude *I*_{CD}(*t*). Hence if *I*_{CD}(*t*) is larger than the critical value discussed above, the dynamics “sees” a unique attractor, the neutral state, and is driven toward it. When *I*_{CD}(*t*) becomes smaller than the critical value, the system sees again three attractors, and finds itself within the basin of attraction of either the initial fixed point (corresponding to the previous decision; Fig. 4*F*), or of the neutral fixed point (Fig. 4*G*). In the latter case, the network is able to engage in a new decision task.

To conclude, to have the network performing sequential decision tasks, one needs *I*_{CD,max} to be larger than the critical value (*I*_{CD} ≈ 0.0215 nA; Fig. 4*H*), and, for a given value of *I*_{CD,max}, to have a time constant τ_{CD} large enough compared with the RSI for the system to relax close enough to the neutral attractor at the onset of the next stimulus. However, sequential effects may exist only if the current decreases sufficiently rapidly, so that the trajectory is still significantly dependent on the state at the previous decision. This justifies the choice of exponential decrease of the inhibitory current (Eq. 7) and the numerical value of τ_{CD} = 200 ms. We note that, recording from relay neurons, Sommer and Wurtz (2002) show that the signal corresponding to the CD lasts several hundreds of milliseconds. This time scale falls precisely in the range of values of the relaxation time constant of the model (Fig. 2*B*), and corresponds to values for which, as we will see, the model shows sequential effects.
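As a rough worked example of this quasi-static argument, one can estimate the time after the decision at which an exponentially decaying CD of initial amplitude *I*_{CD,max} falls below the critical amplitude of ∼0.0215 nA, for τ_{CD} = 200 ms and the two amplitudes considered in the Results (0.035 and 0.08 nA). The symbols $t^*$ and $I_c$ are introduced here for illustration only, and the estimate ignores the network's own relaxation time, so it is an order-of-magnitude guide rather than a prediction.

$$
\begin{aligned}
I_{CD,\max}\, e^{-t^*/\tau_{CD}} = I_c \quad &\Longrightarrow \quad t^* = \tau_{CD}\, \ln\frac{I_{CD,\max}}{I_c}, \\
I_{CD,\max} = 0.035\ \text{nA}: \quad & t^* \approx 200\ \text{ms} \times \ln(1.63) \approx 97\ \text{ms}, \\
I_{CD,\max} = 0.08\ \text{nA}: \quad & t^* \approx 200\ \text{ms} \times \ln(3.72) \approx 263\ \text{ms}.
\end{aligned}
$$

In both cases $t^*$ is shorter than an RSI of 500 ms, so under this approximation the three-attractor landscape is restored before the next stimulus; the larger amplitude keeps the network in the single-attractor regime almost three times longer, which leaves more time for the memory of the previous decision to fade.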

#### Numerical simulations design and statistical tests

##### Numerical simulations.

The simulation of sequential decision-making is as follows: a stimulus with a randomly chosen coherence is presented until the network reaches a decision (decision threshold crossed). The decision is immediately followed by the removal of the stimulus, and a relaxation period during the RSI. Then a new stimulus is presented, initiating the next trial (Fig. 2*C*). The set of dynamical equations (Eqs. 1, 6), with the definitions (Eqs. 2, 5, 7–9), is numerically integrated using the Euler–Maruyama method with a time step of 0.5 ms. At the beginning of a simulation, the system is set in a symmetric state *S*_{L} = *S*_{R} = *s*_{0}, with low firing rates and synaptic activities, *s*_{0} = 0.1. We compute the instantaneous population firing rates, or the synaptic dynamical variables *S*_{L} and *S*_{R}, by averaging over a time window of 2 ms, slid with a time step of 1 ms. The accuracy of the network's performance is defined as the percentage of trials for which the unit crossing the threshold corresponds to the stronger input. For data analysis we mainly work with the variables *S*_{L} and *S*_{R}, which are analogous to the firing rates *R*_{L} and *R*_{R} [because they are monotonic functions of *S*_{L} and *S*_{R} (Wong and Wang, 2006)] but are less noisy (Fig. 3). We consider that the system has made a decision when for the first time the firing rate of one unit crosses a threshold θ, fixed at 20 Hz. The reaction time during one trial is defined as the time needed for the network to reach the threshold from the onset of the input stimulus. We neglect the possible additional time due to motor reaction and signal transduction. In addition to the reaction times, we compute the *discrimination threshold*, which is linked to the accuracy. The definition is based on the use of a Weibull function commonly used to fit the psychometric curves (Quick, 1974). That is, one writes the performance (mean success rate) as follows:

$$\mathrm{Perf}(c) = 1 - 0.5\, \exp\left[-\left(\frac{c}{\alpha}\right)^{\beta}\right], \tag{10}$$

where α and β are parameters. Then, for *c* = α, Perf(*c*) = 1 − 0.5 exp(−1) ∼ 0.82.

Hence one defines the discrimination threshold as the coherence level at which the subject responds correctly 82% of the time.
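To make the definition of the discrimination threshold concrete, the following Python sketch evaluates the Weibull psychometric function Perf(*c*) = 1 − 0.5 exp[−(*c*/α)^β] and recovers α and β from a set of (coherence, accuracy) points. The grid search is an assumption made here for brevity, not the fitting procedure used in the paper, and the function names are illustrative.

```python
import math


def weibull_performance(c, alpha, beta):
    """Perf(c) = 1 - 0.5 * exp(-(c / alpha) ** beta), for coherence c >= 0."""
    return 1.0 - 0.5 * math.exp(-((c / alpha) ** beta))


def fit_weibull(coherences, accuracies, alphas, betas):
    """Least-squares grid search over candidate (alpha, beta) pairs.

    Returns the pair minimizing the summed squared error; alpha is then the
    discrimination threshold (coherence giving about 82% correct).
    """
    best = None
    for alpha in alphas:
        for beta in betas:
            err = sum((weibull_performance(c, alpha, beta) - p) ** 2
                      for c, p in zip(coherences, accuracies))
            if best is None or err < best[0]:
                best = (err, alpha, beta)
    return best[1], best[2]
```

In practice one would replace the grid search with a continuous optimizer, but the grid version keeps the example dependency-free and makes the role of α as the discrimination threshold explicit.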

We list in Table 1 the model parameters used in the simulations. For Figures 5 and 7 we have used continuous sequences of 1000 trials averaged over 24 independent simulations, allowing a more specific comparison with the experiments of Bonaiuto et al. (2016), done with 24 subjects. Figures 9 to 16 present results obtained for sequences of 1000 trials averaged over 50 independent simulations to allow for a better statistical analysis. The number of trials per sequence, 1000, is a typical order of magnitude in experiments (Bonaiuto et al., 2016; Danielmeier and Ullsperger, 2011).
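The protocol described in this section (stimulus until threshold crossing, then relaxation during the RSI under a decaying inhibitory CD) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' Julia code: the parameter values are the commonly quoted Wong and Wang (2006) set together with the threshold, time step, initial state, and τ_{CD} given in the text (Table 1 takes precedence); the threshold is tested on the instantaneous firing rate rather than on a 2 ms sliding average; and the `max_rt` safeguard is an addition made here.

```python
import math
import random

# Assumed parameter values; Table 1 of the paper takes precedence.
A, B, D = 270.0, 108.0, 0.154                 # f(I) parameters: Hz/nA, Hz, s
GAMMA, TAU_S = 0.641, 0.100                   # kinetic parameter, NMDA time constant (s)
J_SELF, J_CROSS = 0.2609, 0.0497              # nA, self-excitation and cross-inhibition
J_EXT, MU_0 = 5.2e-4, 30.0                    # nA/Hz, Hz
I_0, SIGMA, TAU_NOISE = 0.3255, 0.02, 0.002   # nA, nA, s
THETA, TAU_CD, DT, S_0 = 20.0, 0.200, 0.0005, 0.1


def firing_rate(current):
    """Input/output relation f(I), with the removable singularity handled."""
    x = A * current - B
    if abs(x) < 1e-9:
        return 1.0 / D
    return x / (1.0 - math.exp(-D * x))


def step(s, noise, stim, i_cd, rng):
    """Advance synaptic and noise variables by one Euler-Maruyama step (in place)."""
    total = [J_SELF * s[i] - J_CROSS * s[1 - i] + stim[i] + noise[i] + i_cd
             for i in range(2)]
    rates = [firing_rate(cur) for cur in total]
    for i in range(2):
        s[i] += DT * (-s[i] / TAU_S + (1.0 - s[i]) * GAMMA * rates[i])
        noise[i] += (DT / TAU_NOISE) * (I_0 - noise[i]) \
            + SIGMA * math.sqrt(DT / TAU_NOISE) * rng.gauss(0.0, 1.0)
    return rates


def run_sequence(coherences, i_cd_max=0.035, rsi=1.0, max_rt=5.0, seed=0):
    """Simulate consecutive trials; return a list of (choice, reaction_time).

    choice is 0 (unit L), 1 (unit R), or None if no threshold crossing occurred
    within max_rt. A positive coherence favors unit L.
    """
    rng = random.Random(seed)
    s, noise = [S_0, S_0], [I_0, I_0]
    results = []
    for c in coherences:
        stim = [J_EXT * MU_0 * (1.0 + c), J_EXT * MU_0 * (1.0 - c)]
        t, choice = 0.0, None
        while choice is None and t < max_rt:
            rates = step(s, noise, stim, 0.0, rng)
            t += DT
            if max(rates) >= THETA:
                choice = 0 if rates[0] >= rates[1] else 1
        results.append((choice, t if choice is not None else float("nan")))
        # Relaxation during the RSI: stimulus off, exponentially decaying CD.
        for k in range(int(round(rsi / DT))):
            i_cd = -i_cd_max * math.exp(-k * DT / TAU_CD) if choice is not None else 0.0
            step(s, noise, [0.0, 0.0], i_cd, rng)
    return results
```

A call such as `run_sequence([0.256, -0.256, 0.0], seed=1)` returns one (choice, reaction time) pair per trial. A pure-Python loop of this kind is slow for sequences of 1000 trials repeated over many runs, which is one practical reason to use a compiled or JIT-compiled language for the full simulations.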

##### Statistical tests.

Following Benjamin et al. (2018), we consider a *p* value of 0.005 as a criterion for rejecting the null hypothesis in a statistical test. To assess whether the distributions of two continuous variables are different, we make use of the Kolmogorov–Smirnov test (Hollander et al., 2014), and in the case of discrete variable distributions we use the Anderson–Darling test (Shorack and Wellner, 2009). For very large samples, we use the energy distance (Rizzo and Székely, 2016), which is a metric distance between the distributions of random vectors. We use the associated E-statistic (Szekely and Rizzo, 2013) for testing the null hypothesis that two random variables *X* and *Y* have the same cumulative distribution functions. For testing whether the means of two samples are different we make use of the unequal variance test (Welch's test; Hollander et al., 2014).
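For readers unfamiliar with the energy distance, the following Python sketch computes it for two one-dimensional samples, using the definition 2E|*X* − *Y*| − E|*X* − *X*′| − E|*Y* − *Y*′|, and attaches a simple permutation *p* value. This is an illustrative sketch, not the implementation of the R *energy* package; the function names and the number of permutations are assumptions.

```python
import random


def mean_abs_diff(xs, ys):
    """Average of |x - y| over all pairs (x in xs, y in ys)."""
    return sum(abs(x - y) for x in xs for y in ys) / (len(xs) * len(ys))


def energy_distance(xs, ys):
    """Squared energy distance 2 E|X - Y| - E|X - X'| - E|Y - Y'| (V-statistic)."""
    return 2.0 * mean_abs_diff(xs, ys) - mean_abs_diff(xs, xs) - mean_abs_diff(ys, ys)


def permutation_p_value(xs, ys, n_perm=200, seed=0):
    """Fraction of label shufflings whose energy distance is at least the observed one."""
    rng = random.Random(seed)
    observed = energy_distance(xs, ys)
    pooled = list(xs) + list(ys)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if energy_distance(pooled[:len(xs)], pooled[len(xs):]) >= observed:
            count += 1
    # Add-one correction so that the estimated p value is never exactly zero.
    return (count + 1) / (n_perm + 1)
```

The pairwise sums make this quadratic in the sample size, so it is only practical for moderate samples; it is meant to show what the statistic measures, namely how far apart two distributions are relative to their internal spread.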

##### Software and code accessibility.

For the simulations we made use of the Julia language (Bezanson et al., 2017). The code of the simulations can be obtained from the corresponding author upon request. We made use of the XPP software (Ermentrout and Mahajan, 2003) for the phase-space analysis and the computation of the relaxation time constant of the dynamical system. Figures 9, 10, 11, 12, and 19 were produced using Python, and the others using Julia, the language of the simulations. The E-statistics tests were performed using the R package *energy* (Rizzo and Székely, 2014).

## Results

### Sequential dynamics and choice repetition biases

The dynamical properties described above imply that, for the appropriate parameter regime, the RSI relaxation leads to a state which is between the previous decision state and the neutral attractor. If it is still within the basin of attraction of the previous decision state at the onset of the next stimulus, one expects sequential biases. This mechanism is similar to the one discussed by Bonaiuto et al. (2016). However, the relaxation mechanisms are different, as discussed in the Introduction. This results in different quantitative properties, notably and quite importantly in the time scale of the relaxation, which is here more in agreement with experimental findings (Cho et al., 2002).

We will specifically show that nonlinear dynamical effects are at the core of post-error adjustments. As a preliminary step, it is necessary to investigate the occurrence of sequential effects in our model. We do so by describing more precisely the intertrial dynamics: we need to specify where the network state lies at the onset of a new stimulus, with respect to the boundaries between the basins of attraction. We take advantage of this analysis to explore response repetition bias as studied by Bonaiuto et al. (2016), and to confront the model behavior with other empirical findings (Laming, 1979a; Cho et al., 2002). In all the following, we study the model properties as a function of two parameters, the amplitude of the corollary discharge, *I*_{CD,max}, and the duration of the RSI.

### Network behavior: reaction times biases

After running simulations of the network dynamics with the protocol of Figure 2*C*, we analyze the effects of response repetition by separating the trials into two groups, the Repeated and Alternated cases. The repeated (respectively alternated) trials are those for which the decision is identical to (respectively, different from) the decision at the previous trial. Note that we do not consider whether the stimulus category is repeated or alternated: the analysis is based on whether the decision is identical or different between two consecutive trials (Fleming et al., 2010; Padoa-Schioppa, 2013). Such analysis is appropriate, because the effects under consideration depend on the levels of activity specific to the previous decision. We run a simulation of 1000 consecutive trials, each of them with a coherence value randomly chosen among 20 values in the range (−0.512, 0.512). We do so for two values of the CD amplitude, *I*_{CD,max} = 0.035 nA and *I*_{CD,max} = 0.08 nA, with an RSI of 1 s, the other parameters being given in Table 1.
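The grouping described above depends only on consecutive decisions, not on the stimuli. A minimal Python sketch of this split is given below; the function name and the representation of decisions as labels such as 'L' and 'R' are illustrative assumptions.

```python
def split_repeated_alternated(decisions, reaction_times):
    """Group reaction times by whether the decision repeats the previous one.

    decisions: list of labels such as 'L' / 'R', one per trial.
    reaction_times: list of reaction times, same length.
    The first trial has no predecessor and is left out of both groups.
    """
    if len(decisions) != len(reaction_times):
        raise ValueError("decisions and reaction_times must have equal length")
    repeated, alternated = [], []
    for prev, curr, rt in zip(decisions, decisions[1:], reaction_times[1:]):
        (repeated if curr == prev else alternated).append(rt)
    return repeated, alternated
```

The two returned lists can then be compared with any two-sample test on the reaction-time distributions, for instance through their means or through a distance between distributions.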

We find that the distributions of coherence values are identical for the two groups, for both values of *I*_{CD,max} (Anderson–Darling test, *p* = 0.75 and *p* = 0.84, respectively). We study the reaction times separately for the two groups, and present the results in Figure 5. In Figure 5*C* we represent the so-called energy distance (Szekely and Rizzo, 2013; Rizzo and Székely, 2016) between the repeated and alternated reaction-time distributions. As we can observe, the distance decreases, hence the sequential effect diminishes, as the CD amplitude *I*_{CD,max} increases. For the specific case of Figures 5, *A* and *B*, the corresponding E-statistic for testing equal distributions leads to the conclusion that in the case *I*_{CD,max} = 0.035 nA, the two reaction-time distributions are different (*p* = 0.0019). This implies that the behavior of the network is influenced by the previous trial. We observe a faster mean reaction time (∼55 ms) when the choice is repeated (Fig. 5*A*), with identical shape of the reaction-time distributions. The difference in means is of the same order as found by Cho et al. (2002) in experiments on TAFC tasks. On the contrary, for *I*_{CD,max} = 0.08 nA (Fig. 5*B*), the two histograms cannot be distinguished (E-statistic test, *p* = 0.25).

We have checked that increasing the RSI has a similar effect to increasing the CD amplitude. We observe sequential effects for RSI values in the range 0.5–5 s, in accordance with two-choice decision-making experiments, where such effects are observed for RSI <5 s (Rabbitt and Rodgers, 1977; Laming, 1979a; Soetens et al., 1985).

### Neural correlates: dynamic analysis

With the relaxation of the activities induced by the CD, the state of the network at the onset of the next stimulus lies in-between the attractor state corresponding to the previous decision, and the neutral attractor state. When averaging separately over repeated and alternated trials, we find, as detailed below, that this relaxation dynamic has different behaviors depending on whether the next decision is identical or different from the previous one. Note that this is a statistical effect which can only be seen by averaging over a very large number of trials.

In Figure 6 we compare two examples of network activity, one with an alternated choice, and one with a repeated choice, by plotting the dynamics during two consecutive trials. We observe in Figure 6*A*, the alternated case, that before the onset of the second stimulus (light blue rectangle) the activities of the two populations are at very similar levels. In contrast, for the case of a repeated choice, Figure 6*C*, the activities are well separated, with higher firing rates.

In Figure 6*B* we give a classical phase-plane representation of the network dynamics during two consecutive trials, with the axes as the synaptic activities of the winning versus losing neuronal populations in the first trial. One sees a trajectory starting from the neutral state, going to the vicinity of the attractor corresponding to the first decision, and then relaxing to the vicinity of the neutral state (as illustrated in Fig. 4*G*). Then the trajectory goes toward the attractor corresponding to the next decision, different from the first one. This aspect of the dynamics is similar to what is obtained by Gao et al. (2009) with another type of attractor network. We show in Figure 6*D* the phase-plane dynamics in the case of a repeated choice (trajectory in blue). On this same panel, for comparison we reproduce in light red the dynamics, shown in Figure 6*B*, during the first trial in the alternated case. As can be seen in Figure 6*D*, the network states at the time of decision are different depending on whether the network makes a decision identical to, or different from, the one made at the previous trial.

To check the statistical significance of these observations, we represent in Figure 7 the mean activities during the RSI, obtained by averaging the dynamics over all trials, separately for the alternated and repeated groups. As expected, for small values of *I*_{CD,max} (0.035 nA), the two dynamics are clearly different. This difference diminishes during relaxation. However at the onset of the next stimulus we can still observe some residues, statistically significant according to an Anderson–Darling test done on the 500 ms before the next stimulus (between winning population, *p* = 0.0034, between losing population *p* = 3.2 × 10^{−8}).

Looking at Figure 7*A*, we observe that the ending points of the alternated and repeated relaxations are biased with respect to the symmetric state. At the beginning of the next stimulus the network is already in the basin of attraction of the repeated case. Hence, it will be harder to reach the alternated attractor state (in the green region). When increasing *I*_{CD,max} (Fig. 7*B*), we observe that the ending state of the relaxation is closer to the neutral attractor state. Hence, the biases in sequential effects disappear because at the beginning of the next stimulus the network starts from the symmetric (neutral) state. The same analysis holds for longer RSI: the dynamics are almost identical (Anderson–Darling test: between winning population, *p* = 0.25, between losing population *p* = 0.4), and both relaxations end near the neutral attractor state. The bias depending on the next stimulus is not observed anymore, and the sequential effect on reaction time hence disappears.

Note that the sequential effects only depend on whether or not the states at the end of the relaxation lie on the basin boundary. However, we have just seen that the effects can also be observed at the level of the relaxation dynamics, because the trajectories for alternated and repeated cases are identical when there is no effect, and different in the case of sequential effects.

The analysis of the dynamics also leads to expectations for what concerns the bias in accuracy toward the previous decision. Indeed, this can be deduced from Figure 7. If the choice at the previous trial was *R* (respectively *L*), then, at the end of the relaxation, the network lies closer to the basin of attraction of attractor *R* (respectively *L*). Thus when presenting the next stimulus, the decision will be biased toward the previous state, so that the probability of making the same choice will be greater than that of making the opposite choice. Otherwise stated, given the stimulus presented at the current trial, the probability of making the choice *R* will be greater when the previous choice was also *R* than when the previous choice was *L*. Numerical simulations confirm this analysis, as illustrated in Figure 8. The RSI dependency is statistically significant (generalized linear model: *r* = −3.9, *p* < 0.0001). For small RSI (500 ms), the decision is biased toward the previous one, and for RSI of several seconds this effect disappears. These results are in agreement with experimental findings of Bonaiuto et al. (2016). The authors studied response repetition biases in humans with RSIs of at least 1.5 s. In these experiments, they measure the Left–Right indecision point, that is the level of coherence resulting in chance selection. Compared with the repeated case, they find that the indecision point for the alternated case is at a higher coherence level, and this shift decreases as the RSI increases.

Sequential decision effects have also been analyzed within the drift-diffusion model (DDM) framework (Farrell and Ludwig, 2008; Goldfarb et al., 2012). Behavioral data can be fitted by different choices of starting points, and possibly of thresholds (Goldfarb et al., 2012). The modification of the starting point in a DDM framework is analogous to the effect of the relaxation in our model. However, most works based on DDM make a *post hoc* analysis of empirical data, with separate fits for alternated and repeated cases.
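To make the analogy with a shifted starting point concrete, here is a minimal sketch of a DDM with symmetric bounds, in which the starting point is displaced toward the bound associated with the previous decision. All numerical values (drift, bound, noise level, time step, starting point) and the function names are hypothetical choices for illustration. They are not fitted to any data and are not parameters of the attractor network.

```python
import random


def simulate_ddm(start, drift=0.0, bound=1.0, sigma=1.0, dt=0.005, rng=random):
    """Simulate one drift-diffusion trial with Euler-Maruyama steps.

    Returns (choice, decision_time), where choice is +1 if the upper bound
    (taken here as the previous decision) is reached, and -1 otherwise.
    """
    x, t = start, 0.0
    step_sd = sigma * dt ** 0.5
    while abs(x) < bound:
        x += drift * dt + step_sd * rng.gauss(0.0, 1.0)
        t += dt
    return (1 if x > 0 else -1), t


def repetition_summary(n_trials=1000, start=0.3, seed=1):
    """Return (P(repeat), mean time of repeated, mean time of alternated)."""
    rng = random.Random(seed)
    trials = [simulate_ddm(start, rng=rng) for _ in range(n_trials)]
    repeat_times = [t for choice, t in trials if choice == 1]
    alternate_times = [t for choice, t in trials if choice == -1]
    p_repeat = len(repeat_times) / n_trials
    mean_repeat = sum(repeat_times) / len(repeat_times)
    mean_alternate = sum(alternate_times) / len(alternate_times)
    return p_repeat, mean_repeat, mean_alternate


if __name__ == "__main__":
    print(repetition_summary())
```

In this sketch, a starting point shifted toward the previous decision yields both a higher probability of repeating the choice and shorter decision times for repeated choices, which is the qualitative pattern discussed above. Unlike in the attractor model, the shift has to be imposed by hand.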

To conclude this section, at the time of decision, the winning population has a firing rate higher than the losing population. After relaxation, at the onset of the next stimulus, the two neural pools have more similar activities, but are still sufficiently different, that is the dynamics is still significantly away from the neutral attractor. At the onset of the next stimulus, the system finds itself in the basin of attraction of the attractor associated with the same decision as the previous one. This results in a dynamical bias in favor of the previous decision. The probability of making the same choice as the previous one is then larger than that of the other choice, and the reaction time for making the same choice (repeated case) is shorter than for making the opposite choice (alternated case). In accordance with these results, studies on the LIP, SC, and basal ganglia have found that the baseline activities before the onset of the stimuli can reflect the probabilities of making the saccade, under specific conditions (Lauwereyns et al., 2002; Ding and Hikosaka, 2006; Rao et al., 2012). Our model shows that these modulations of the baseline activities can be understood as resulting from the across-trial dynamics of the decision process.

### Post-error effects

#### Post-error adjustments on reaction times

The most interesting and well established effect is the one of PES (Laming, 1979a; for review, see Danielmeier and Ullsperger, 2011). It consists of prolonged reaction times in trials following an error, compared with reaction times following a correct trial. This effect has been observed in a variety of tasks: categorization (Jentzsch and Dudschig, 2009), flanker (Debener et al., 2005), and Stroop (Gehring and Fencsik, 2001) tasks. Jentzsch and Dudschig (2009) and Danielmeier and Ullsperger (2011) found that the PES effect depends on the RSI. The amplitude of this effect, defined as the difference between the mean reaction times of post-error and post-correct trials, decreases as one increases the RSI, with values going from several dozens of milliseconds to zero. For RSIs >750–1500 ms, PES is not observed anymore. Remarkably, the PES effect is reported in cases where the subject does not receive information on the correctness of the decision (Jentzsch and Dudschig, 2009; Danielmeier and Ullsperger, 2011; Danielmeier et al., 2011). Moreover, this effect is automatic and involuntary (Rabbitt, 2002), and is independent of error detection and the correction process, which involve other cortical areas (Rodriguez-Fornells et al., 2002). This suggests a rather low level processing origin in line with the present model.
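The amplitude defined above (mean reaction time of post-error trials minus mean reaction time of post-correct trials) can be computed from a trial sequence as in the following sketch. The function name and the toy data are hypothetical.

```python
from statistics import mean


def post_error_slowing(correct, reaction_times):
    """Mean RT after error trials minus mean RT after correct trials.

    correct[i] is True if trial i was correct, and reaction_times[i] is its RT.
    A positive value indicates slowing after errors. A negative value
    indicates faster responses after errors.
    """
    pairs = list(zip(correct, reaction_times[1:]))  # (previous outcome, current RT)
    post_error = [rt for prev_ok, rt in pairs if not prev_ok]
    post_correct = [rt for prev_ok, rt in pairs if prev_ok]
    return mean(post_error) - mean(post_correct)


# Toy sequence with made-up reaction times in milliseconds.
correct = [True, False, True, True, False, True]
reaction_times = [500, 480, 530, 490, 470, 540]
pes = post_error_slowing(correct, reaction_times)  # 535 - 480 = 55 ms
```

The function assumes that the sequence contains at least one post-error and one post-correct trial. Otherwise `statistics.mean` raises an error.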

In this section we investigate the occurrence of post-error adjustments in our model. We confront the results with empirical findings from various behavioral experiments with TAFC (and marginally also 4-AFC) protocols in which, as is also the case in our model, there is no feedback on the correctness of the decision. We will notably discuss the model predictions by comparing the results with those of Danielmeier and Ullsperger (2011), who studied the dependence of PES on the RSI, as well as the relation between PES and PIA.

We studied the occurrence of the PES effect in the model with respect to the coherence level and *I*_{CD,max}, at an intermediate RSI value of 500 ms, leading to the phase diagram in Figure 9*A*. We find a large domain in parameter space showing PES effect (Fig. 9, red). Figure 9*B* zooms on a value of *I*_{CD,max} for which PES occurs (*I*_{CD,max} = 0.035 nA). We observe that the magnitude of the PES effect goes from 0 to 10 ms at *c* = 10%, hence remaining within the range of behavioral data (Jentzsch and Dudschig, 2009; Danielmeier and Ullsperger, 2011; 10–15 ms for a RSI of 0.5–1 s). In these experiments (a flanker task with stimuli belonging to 1 of 2 opposite categories, Left or Right directions), the ambiguity level is not quantified. However, the observed error rates are found ∼10%, which, within our model, corresponds to a coherence level of about *c* = 10%. On the phase diagram, one can observe the variation of the PES effect with respect to the coherence level. In the region where we observe a PES effect, we find that it is enhanced under conditions when errors are infrequent. However, for large values of the coherence level, this effect cannot be observed anymore because of the absence of any error in the successive trials (∼100% of correct trials). This occurrence of PES, principally at low error rates, has been found in experiments of Notebaert et al. (2009); Núñez Castellar et al. (2010), for which the authors observe PES when errors are infrequent, but not when errors are frequent. Note that these experiments are with 4-AFC tasks, but we expect the same type of properties as for TAFC tasks, and the model could easily be adapted to such cases with a neural pool specific to each one of the four categories.

The phase diagram, Figure 9*A*, also shows parameter values with no effect at all (gray), and a domain with the opposite effect, that is with reaction times faster after an error than after a correct trial (blue). We propose to call this effect *post-error quickening* (PEQ), as opposed to PES. As shown in Figure 9*C*, we find that, for a given value of *I*_{CD,max}, one can have PES at low coherence level, and PEQ at high coherence level.

This PEQ effect, although much less studied, has been observed in various AFC experiments, either without feedback (Rabbitt and Rodgers, 1977; Notebaert et al., 2009; King et al., 2010) or with feedback (Purcell and Kiani, 2016), notably for fast-response regimes (Notebaert et al., 2009; King et al., 2010). The conditions for observing PEQ remain however not well established, with some contradictory results. We note that with go/no-go protocols (which are similar to AFC protocols in many respects), Hester et al. (2005) report post-error decrease in reaction times for aware errors, but not for unaware errors, whereas Cohen et al. (2009) on the contrary reports no PEQ effect, but a larger PES effect for aware errors than for unaware errors. The fact that the model predicts PEQ in TAFC tasks at high coherence levels is more in line with the results of the fMRI experiments of Hester et al. (2005). Indeed, at high coherence levels, responses are fast and most often correct. In the rare case of an error, the subject is likely to become aware that an error has been made (Yeung and Summerfield, 2012). This thus may lead to a correlation (without causal links) between aware errors and PEQ.

We also studied the RSI dependency of the PES effects by plotting the phase diagram at *I*_{CD,max} = 0.045 nA with respect to the RSI (Fig. 10). In behavioral experiments the PES effect depends strongly on the RSI. For RSIs >1000–1500 ms the observation or not of PES depends specifically on the decision task (Jentzsch and Dudschig, 2009; King et al., 2010). However, a common observation is that, whenever PES is observed, if one keeps increasing the RSI, the PES effect eventually disappears. In Figure 10, we observe that, for parameters where PES is observed at a RSI of 500 ms, increasing the RSI leads to the weakening of the PES effect until its disappearance. At a RSI of 1000–1500 ms this effect is not present anymore, in agreement with experimental results (Jentzsch and Dudschig, 2009).

The variation of PEQ with respect to RSI has not been experimentally studied, as this effect is more controversial. However, our model shows that the dependence on RSI is similar to that of PES, and predicts that when both effects exist at the same RSI value (for different coherence levels), increasing the RSI leads to the disappearance of both of them.

We note here that the set of phase diagrams that we present in this work on the various effects, Figures 9–12, provide testable behavioral predictions. As just discussed in the particular case of PES and PEQ, they predict how the effects on reaction times are or are not correlated, and in particular how they qualitatively depend on, and covary with, the coherence level or the duration of the RSI.

#### Post-error improvement in accuracy

PIA is another sequential effect reported in experiments (Laming, 1979a; Marco-Pallarés et al., 2008; Danielmeier and Ullsperger, 2011). PIA has been observed on different time-scales: long-term learning effects following error (Hester et al., 2005) and trial-to-trial adjustments directly after commission of error responses. We only consider this latter type of PIA. The specific conditions under which PIA can be observed in behavioral experiments have not been totally understood. We investigate this effect in the specific context of our model, considering that the strength of the effect is linked to the difference in error rates between post-error and post-correct trials.
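A minimal sketch of how the strength of the PIA effect could be measured from a sequence of trial outcomes is given below. The sign convention (error rate after correct trials minus error rate after error trials, so that a positive value means improved accuracy after an error) and the function name are assumptions made for this illustration.

```python
def post_error_accuracy_gain(correct):
    """Error rate after correct trials minus error rate after error trials.

    correct[i] is True if trial i was correct. A positive return value means
    that accuracy is better right after an error than after a correct trial.
    """
    transitions = list(zip(correct, correct[1:]))  # (previous, current) outcomes
    after_error = [curr for prev, curr in transitions if not prev]
    after_correct = [curr for prev, curr in transitions if prev]

    def error_rate(outcomes):
        return 1 - sum(outcomes) / len(outcomes)

    return error_rate(after_correct) - error_rate(after_error)


# Toy sequence: every error is followed by a correct trial.
outcomes = [True, False, True, True, False, True, False, True]
gain = post_error_accuracy_gain(outcomes)  # 0.75 - 0.0 = 0.75
```

As with any difference of rates, the estimate is noisy when errors are rare, so a large number of trials is needed at high coherence levels.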

In Figure 11 we represent the phase diagram of the PIA effect with respect to coherence levels (*x*-axis) and corollary discharge amplitude (*y*-axis). We note a large region of parameters for which PIA is present. We find a magnitude of the PIA effect of ∼2–4%, which is of the same order of magnitude as in the experiments where, for RSIs in the range 500–1000 ms, it is found that post-error accuracy is improved by ∼3% (Jentzsch and Dudschig, 2009).

Looking at Figure 11, one sees that the PIA and PES effects occur in the same region of parameters. However, if we zoom in on specific regions (Fig. 11*B*,*C*), we can notice some differences in the variation of these effects. The black dashed rectangular regions correspond to the same parameters as in Figure 9. We first note that PIA is also observed in these regions. However, we observe a decrease of PES at very large coherence (Fig. 9*B*), but not of PIA (Fig. 11*B*). Moreover the decrease of the PIA effect in Figure 9*C* does not occur at the same values of parameters as for the PES one. It would be tempting to interpret PIA as a better accuracy resulting from taking more time for making the decision. This is not the case, because PIA does not appear uniquely in the PES region, but in the PEQ one too. In agreement with these model predictions, Danielmeier et al. (2011), in a TAFC task with color-based categories, observe that PIA can occur in the absence of PES, but that the occurrence of PES is always associated with PIA (except for 1 subject among 20, results reported by Danielmeier and Ullsperger, 2011, their Fig. 1).

In EEG experiments, Marco-Pallarés et al. (2008) find that time courses of PES and PIA seem to be dissociable as they observe post-error improvements in accuracy with longer intertrial intervals (up to 2250 ms) than PES. We note that these authors consider protocols with and without stop-signals, and here we are only concerned with those without. We investigate the variation with respect to the RSI of PIA in our model (Fig. 12). We note that, for long RSIs, the PIA effect is not observed anymore. However, as observed by Marco-Pallarés et al. (2008), the PIA effect occurs for longer RSIs than the PES effect (Fig. 11*A*). In the same way, PIA is more robust with respect to the intensity of the corollary discharge. This is corroborated by Figure 13, *A* and *B*, which represent the PES and PIA effects for a larger relaxation time, τ_{CD} = 500 ms, hence with a stronger CD. We note that all the regimes previously observed are present, for slightly different parameter ranges. This shows that the global picture illustrated by the phase diagrams, Figures 9 and 11, is not specific to a narrow range of *I*_{CD,max} and τ_{CD} values.

Verguts et al. (2011) find that PIA and PES seem to happen independently, suggesting that at least two post-error processes take place in parallel. An important outcome of our analysis is to show that PIA and PES effects can both result from the same underlying dynamics. In addition, in the parameter domain where they both occur, we find that the variations of these effects with respect to the coherence levels are indeed uncorrelated (Pearson correlation test: RSI of 500 ms and *I*_{CD} = 0.035 nA, *p* = 0.58, *I*_{CD} = 0.05 nA, *p* = 0.79 and *I*_{CD} = 0.1 nA, *p* = 0.25; RSI of 2000 ms and *I*_{CD} = 0.035 nA, *p* = 0.37). This non-correlation highlights the complexity of such post-error adjustments, as explored by Verguts et al. (2011).

To gain more insights into the PIA effect, we study the discrimination threshold following an error or a success, with respect to the RSI (Fig. 12*B–D*). In Figure 12*B* we represent the distribution of the discrimination threshold for *I*_{CD,max} = 0.035 nA and a RSI of 500 ms. For these parameters, the distributions for the post-error and post-success cases are highly different (Smirnov–Kolmogorov test: *p* < 10^{−20}). If one increases the RSI (Fig. 12*C*, 1000 ms, *D*, 1500 ms), this difference disappears (Smirnov–Kolmogorov test: *p* = 0.038 and *p* = 0.4, respectively). However, we note that the model predicts a wider distribution of the discrimination threshold after an error than after a correct trial, independently of the presence of the PIA effect. This might result from the wider distribution in the neural (or synaptic) activities after an error that we discuss in the next section. To our knowledge, this effect has not been studied in behavioral experiments.

### Dynamical analysis

In this section we analyze the PES and PEQ effects in terms of neural dynamics. First of all, we represent and discuss the dynamics on individual trials for the three regions of parameters: with neither PES nor PEQ effects, with PES effect, and with PEQ effect (Fig. 14). We observe the dynamics for post-error and post-correct trials during the relaxation period following a decision and during the presentation of the next stimulus. Already on individual trials we notice differences between the regions. Figure 14*A* represents a trial in the region without PES or PEQ. The post-error/correct dynamics are indistinguishable. Hence we do not observe any differences in the reaction times. Looking at a trial in the PEQ region (Fig. 14*B*), we notice that the population *L* (the winning one for the second stimulus) for the post-error case seems a bit higher in activity than for the post-correct case. This leads to the post-error quickening effect, as the post-error (orange) curve will reach the threshold sooner than the post-correct (blue) one. Finally, Figure 14*C* represents individual trials for parameters in the PES region. In the phase diagram (Fig. 9) the effect was more pronounced than PEQ, thus it is more pronounced on the dynamics too. During the relaxation, and the presentation of the next stimulus, the post-correct dynamics (blue curve) for population *L* (the winning one for the second stimulus) is higher than the post-error one. As we can observe this leads to a faster decision time for the post-correct trial than for the post-error one.

We show now that the dynamics explains the three effects PES, PEQ and PIA. We provide in Figures 16–18 a semiqualitative and semiquantitative analysis of the dynamics of the synaptic activities in the phase plane of the system, for several parameter regimes. Here again, the analysis is easier working on the synaptic activities. This can be seen by considering Figure 15 on which we represent the mean firing rate and synaptic activity of the winning population in the PES case. Because of the range of variation of the firing rates, and the intrinsic noise of the system, it is hard to observe a difference between the neural activities. However, if we compute this difference (Fig. 15, inset) we note the following. At the beginning of the next trial, the difference between the post-error and post-correct firing rates is significantly below zero, hence the reaction time will be shorter for post-correct than for post-error trials. We find the same behavior for the synaptic activities (Fig. 15*B*), but much less noisy, as expected from the discussion in Materials and Methods.

#### PES effect

We now detail the analysis of the PES effect (and of the concomitant PIA effect) based on Figure 16. Let us first explain how each panel is done. Without loss of generality, we assume that the last decision made is *R*. Repeated and Alternated cases thus correspond to next trial decisions *R* and *L*, respectively. The *x*- and *y*-axes are the synaptic activities *S*_{L} and *S*_{R}, respectively; hence, the losing and winning populations for the first trial.

On the left, we represent with dashed lines the average dynamics during the relaxation period, that is from the decision time for the previous stimulus to the onset of the next stimulus. This allows to identify clearly the typical neural states at the end of the relaxation period. The average is done over post-error (respectively, post-correct) trajectories sharing a same state at the time of the last decision. The choice of these two initial states is based on the following remark. A typical trial with a correct decision will lead, at the time of decision, to losing and winning populations with highly different activity rates, hence a neural activity, and thus a synaptic activity *S*_{L}, far from the threshold value. On the contrary, a typical error trial will show a losing activity not far from the threshold; this can also be observed in the study by Wong et al. (2007, their Fig. 4*B*). We can thus represent post-correct trials, respectively post-error trials, by dynamics with initial states having a rather small, respectively large, value of *S*_{L} (and in both cases the first trial winning population *S*_{R} at threshold value).

We then represent with a continuous line the average trajectory following the onset of the next stimulus. We observe this dynamics during the same time for post-error and post-correct cases, as if there were no decision threshold, to compare the dynamics of post-error and post-correct cases for the same duration of time. Decision actually occurs when the trajectory crosses the decision line (dashed gray line); this is approximate: because of the noise, there is no one to one correspondence between a neural activity reaching the decision threshold and a particular value of the associated synaptic activity. Having all the trajectories plotted for the same duration (and not only until the decision time) allows to visually compare the associated reaction times.

On the right, we represent typical trajectories during the presentation of the next stimulus. The black dot on every panel gives the location of the neutral attractor that exists during the relaxation dynamics. The basins of attraction that are represented are the ones associated with the attractors, *L* and *R*, of the dynamics induced by the onset of the next stimulus. Recall that these attractors are different from the ones associated with the dynamics during the relaxation period.

We can now analyze the dynamics. In the repeated case (Fig. 16*A*,*B*), at the end of the relaxation (that is at the onset of the next stimulus), both post-correct and post-error trials lie within the correct basin of attraction. Hence, the error rates for these trials are similar. However, the neural states reached at the end of the relaxations are different. Compared with the post-error trial, the post-correct state is closer to the boundary of the new attractor associated with decision *R*, and the corresponding decision will thus be faster. In the alternate case (Fig. 16*C*,*D*), the states reached at the end of the relaxation period do not lie within the correct basin of attraction. During the decision-making dynamics, the trajectory needs to cross the boundary between the two basins of attraction. The post-correct trials leading to an alternate decision have rather straight dynamics across the boundary, leading to relatively fast decision times. In contrast, the states at the onset of the stimulus of the post-error trials are closer to the boundary so that the corresponding trajectories cross with a smaller angle with respect to the basin boundary. This leads to longer reaction times, hence the PES effect. It would be interesting to have electrophysiological data with which the model predictions could be directly compared. However, in a typical experiment on monkeys, a feedback on the correctness of the decision is given, because the animal learns the task thanks to a reward-based protocol. Nevertheless, we note that, in the random-dot experiments on monkeys by Purcell and Kiani (2016), the authors find a higher buildup rate of the neural activity for post-correct trials than for post-error trials (Purcell and Kiani, 2016, their Fig. 6). Within our framework, this can be understood as trajectories that cross the basin boundary more quickly for post-correct trials, in accordance with our model's predictions. This suggests that the observed difference in buildup rates may not result from some mechanism making use of the information on the correctness of the decision, but rather from the nonlinear dynamics discussed here.

The PIA is understood from the same analysis as for the PES effect. For specific realizations of the noise that lead to error trials, the post-error trials dynamics is closer to the boundary. Thus it has a higher probability to fall on the other side of the basin of attraction. Hence, the error rates are lower for post-error trials than post-correct trials.

#### PEQ effect

The PEQ effect can be understood from the same kind of analysis, based here on Figure 17 (analogous for the PEQ effect to Fig. 16 for the PES effect). As seen previously, the PEQ effect occurs mostly at high level of coherence. We consider first the repeated case (Fig. 17*A*,*B*). Because the coherence level is high, at the end of the relaxation period, both post-correct and post-error trials lie within the correct basin of attraction, far from the basin boundary. The reaction times and error rates of post-correct and post-error trials for repeated decisions are thus similar.

In contrast, the alternated case (Fig. 17*C*,*D*) exhibits both the PIA and the PEQ effects. The end of the post-error relaxation is now inside the basin of attraction of the alternated choice. Hence, the error rate will be lower than when the ending point is outside this region (post-correct trials begin at the boundary of the basin of attraction). Moreover, the post-correct trial dynamics have to cross the boundary. Hence they are closer to the manifold, which leads to slower dynamics, whereas the post-error dynamics can directly reach the new attractor state. This analysis also explains why the decreases of PES and PIA do not occur at the same coherence level. Indeed the decrease of PIA occurs when the ending point of the post-error relaxation crosses the boundary, whereas the post-correct ending point remains in the same basin of attraction. For the PES effect to decrease, the dynamics for both cases just need to be closer to the boundary and not necessarily on the opposite side. Hence the decrease of the PES effect occurs at lower coherence than the PIA one.

Here we have seen that the occurrence of the PEQ effect depends on some very specific and fragile feature, the crossing or not of a basin boundary. The conditions for observing the effect are thus likely to vary from individual to individual, and from experiment to experiment. This may explain why the experimental results about the PEQ effect remain controversial.

In Figure 18, *A* and *B*, we investigate the parameter regime, at low coherence level, for which there is no effect: neither PES, nor PEQ, nor PIA. The post-error and post-correct dynamics are very similar and lead to the same relaxation ending point, far from the basin boundary. Finally, in Figure 18, *C* and *D*, we consider the parameter regime, at high coherence level, with only the PIA effect. Here the relaxations of post-error and post-correct trials are different. However, as for the PEQ effect, at high coherence level both dynamics will be fast. For alternated trials, neither of the two ending points is in the correct basin of attraction.

As discussed for the PES effect, electrophysiological data only exist for experiments with feedback on the correctness of the decision. In experiments on monkeys, Purcell and Kiani (2016) obtain puzzling results for what concerns the PEQ effect. They observe an important difference in baseline activities for post-correct and post-error trials, which is not well accounted for either by their DDM analysis or by our model. However, in terms of neural dynamics, this observed difference in the level of neural activities obviously implies that the dynamical states are different at the time of the onset of the stimulus, a fact in agreement with our model's predictions. One may wonder if the separation in baseline activities, and not just in starting points, could be a consequence of the feedback.

#### Correlating post-error effects with the activity distributions at the previous decision

To go beyond the above analysis on the post-error adjustments (PES, PEQ, and PIA effects), we analyze the respective influence of the winning and losing population levels of activity at the time of the previous decision, onto the decision at the next trial. This will first confirm the previous analysis, but also provide more insights on the specificity of the two opposite effects, PES and PEQ.

The mean activity, at the time of the decision, of the winning population is indistinguishable between post-correct and post-error trials [unequal variance (Welch) test: fail to reject, *p* = 0.16 at RSI of 500 ms; fail to reject, *p* = 0.87 at RSI of 2000 ms]. However, for short RSIs (corresponding to the PES regime) the mean synaptic activities, at the time of the decision, of the losing population are different for post-correct and post-error trials [unequal variance (Welch) test: reject, *p* = 2.7 × 10^{−20} at RSI of 500 ms; fail to reject, *p* = 0.57 at RSI of 2000 ms].
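
For readers who want to run the same kind of comparison on their own simulations, the following is a minimal sketch of an unequal variance (Welch) test written with the Python standard library only. The function names (`welch_t`, `two_sided_p_normal`) and the two samples are illustrative assumptions: the samples are synthetic placeholders, not activities produced by the model. The p-value uses a normal approximation to the Student *t* distribution, which is only reasonable for large samples; a statistics library would give the exact tail probability.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on samples a and b."""
    va = variance(a) / len(a)  # squared standard error of the mean of a
    vb = variance(b) / len(b)  # squared standard error of the mean of b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


def two_sided_p_normal(t):
    """Two-sided p-value from a normal approximation (assumes large df)."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


rng = random.Random(0)
# Synthetic placeholders (hypothetical values), NOT data from the model:
# losing-population activity at decision time, split by trial type.
post_correct = [rng.gauss(0.30, 0.05) for _ in range(2000)]
post_error = [rng.gauss(0.33, 0.08) for _ in range(400)]

t, df = welch_t(post_correct, post_error)
p = two_sided_p_normal(t)
print(f"t = {t:.2f}, df = {df:.1f}, p (normal approx.) = {p:.2e}")
```

The Welch form is used here because post-error trials are typically much rarer than post-correct trials, so neither equal sample sizes nor equal variances can be assumed.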

To gain more insight, we plot in Figure 19 the amplitude of the PES effect with respect to the interpercentile range of the distribution of the synaptic activities of the winning and losing populations at the time of the previous decision. We note that when PES occurs, the higher the activity of the losing population at the time of decision, the stronger this effect will be. The influence of the winning population is also observed, although in the opposite direction. When PES occurs these effects are correlated (dark blue: Pearson correlation: *r* = −0.98, *p* = 2.6 × 10^{−7}; medium blue: *r* = −0.98 and *p* = 9.5 × 10^{−7}), in the sense that the variations with respect to the interpercentile range of the winning and losing populations are correlated. These observations are consistent with the analysis of the PES phase-plane trajectories. Indeed, the higher the losing population activity is, the closer to the invariant manifold the state at the end of the relaxation period will be. Hence, the effect will be stronger as it becomes easier (more likely) to cross the boundary.
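
The kind of analysis described above can be sketched as follows: trials are grouped into quantile bins of the losing-population activity at the previous decision, the PES amplitude (mean post-error RT minus mean post-correct RT) is computed in each bin, and a Pearson coefficient summarizes the trend. This is only an illustration under stated assumptions: the helper names, the number of bins, and the toy trial generator are hypothetical, and the synthetic data are not meant to reproduce the sign or magnitude of the correlations reported here.

```python
import random
from statistics import mean, quantiles


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def pes_by_activity_bin(trials, n_bins=5):
    """trials: list of (prev_losing_activity, prev_was_error, rt_ms).
    Returns the PES amplitude (ms) in each quantile bin of the activity."""
    cuts = quantiles([t[0] for t in trials], n=n_bins)  # n_bins - 1 cut points
    amplitudes = []
    for k in range(n_bins):
        lo = cuts[k - 1] if k > 0 else float("-inf")
        hi = cuts[k] if k < n_bins - 1 else float("inf")
        in_bin = [t for t in trials if lo < t[0] <= hi]
        rt_err = [rt for _, err, rt in in_bin if err]
        rt_cor = [rt for _, err, rt in in_bin if not err]
        amplitudes.append(mean(rt_err) - mean(rt_cor))
    return amplitudes


rng = random.Random(0)
trials = []
for _ in range(5000):
    act = rng.random()        # hypothetical losing-population activity
    err = rng.random() < 0.3  # hypothetical error rate at the previous trial
    # Toy rule (assumption): post-error slowing grows with the activity.
    rt = 500.0 + (100.0 * act if err else 0.0) + rng.gauss(0.0, 20.0)
    trials.append((act, err, rt))

pes = pes_by_activity_bin(trials)
r = pearson_r(list(range(len(pes))), pes)
print([round(a, 1) for a in pes], round(r, 3))
```

Binning by quantiles rather than by fixed activity intervals keeps a comparable number of trials in each bin, which matters because error trials are the minority.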

However, we observe in Figure 19, *A* and *C*, a different behavior for the PEQ effect: there is an almost constant value of the PEQ effect with respect to the interpercentile ranges of the distributions of the winning and losing populations' activities. This is explained by the fact that, at the end of the relaxation, if the category of the next stimulus is the opposite of the previous decision, the network state finds itself within the (correct) associated basin of attraction, but very close to the boundary. This is true whatever the correctness of the previous decision. However, the post-correct case will lead to a location even closer to the basin boundary. The nonlinearity of the dynamics near the basin boundary will strongly amplify the small difference between post-correct and post-error ending points. The PEQ effect will thus not be correlated with the size of this difference.

Concerning the PIA effect, we observe in Figure 19, *C* and *D*, a similar dependency on the synaptic activities as for the PES effect, with a stronger effect for high activities of the losing population. This corroborates the above phase plane analysis of the trajectories (Fig. 13). Indeed, the PES and PIA effects both depend on the position of the relaxation in the phase plane. Being closer to the boundary (high activity of the losing population) leads to a smaller error rate in the next trial.

From the above analysis, a prediction of the model is that, whenever there are PES or PIA effects, the mean activity of the losing population is different for correct and error decisions. Moreover, this level of activity is correlated with the amplitude of the post-error adjustment effect. This can be seen in Figure 19, *A* and *B*. In this figure we present the quantiles of the synaptic activities. The results would be similar, but much noisier, for the firing rates. We expect that this prediction can be tested in experiments by measuring the correlation between the amplitude of the PES (or PIA) effect and the difference in mean activities of the losing neural population (difference between post-error and post-correct trials).

## Discussion

We have shown that, without fine tuning of parameters, an attractor neural network accounts, qualitatively and with the correct orders of magnitude, for sequential effects and post-error adjustments reported in TAFC experiments in the absence of feedback about the correctness of the decision.

We provide evidence that these effects all result from the same intrinsic properties of the nonlinear neural dynamics. We present in Figure 20 a schematic diagram of the occurrence of the effects depending on the parameters, even though this does not exhaust the richness of the system's behavior as discussed in this paper. Our results suggest testing this general picture experimentally, and more precisely what is predicted by the phase diagrams, Figures 9–12. In particular, it would be interesting to test the occurrence of post-error quickening at large coherence level or the variations of post-error adjustments with respect to coherence levels.

### Explanations for PES

Several cognitive explanations of PES effects have been proposed (Rabbitt and Rodgers, 1977; Laming, 1979b; Notebaert et al., 2009).

In particular, these effects have been discussed in the framework of DDMs (Dutilh et al., 2012; Goldfarb et al., 2012; Purcell and Kiani, 2016). Dutilh et al. (2012), in experiments without feedback about the correctness of the decision, and Purcell and Kiani (2016), in experiments with feedback, show that post-error and post-correct trials can be fitted by DDMs with different sets of parameter values. In addition, Dutilh et al. (2012) argue that the modification of the decision threshold within the DDM framework would correspond to the hypothesis of increased response caution, the decision becoming more cautious after an error. Yet the neural correlates that would determine the threshold or the starting point remain obscure, especially in the absence of feedback on the correctness of the trial.

Within the attractor network framework considered here, the PES and PEQ effects are explained thanks to an in-depth analysis of the neural dynamics. We have shown that the location of the dynamical state at the end of the relaxation period (end of the RSI), with respect to the basins of attraction of the attractors induced by the next stimulus, depends on what occurred at the previous trial. The fact that we have different properties, e.g., for post-correct and post-error trials, for the same set of parameter values, is a result of the nonlinear dynamics, which amplifies the difference in ending points of the relaxation. This cannot be obtained within the DDM framework (without the addition of other mechanisms) because, in a DDM, the state reached at the time of a decision is identical for an error and a correct trial. An additional outcome of the analysis is that, for a given set of parameter values, different regimes (PES, PEQ, or no effect) may be observed depending on the coherence level of the stimulus: because of the nonlinearities, the dynamical state at the end of the RSI also depends on the coherence level.
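
The remark about the DDM can be illustrated with a minimal simulation sketch. It assumes a standard one-dimensional DDM with symmetric bounds, integrated with an Euler scheme; the parameter values and the function name `ddm_trial` are illustrative choices, not taken from the fits cited above. At the time of the decision the decision variable sits at a bound, so, up to its sign, its value does not depend on whether the trial was correct; the small excess over the bound seen in the output is a discretization overshoot.

```python
import random
from statistics import mean


def ddm_trial(rng, drift=0.5, sigma=1.0, theta=1.0, dt=1e-3):
    """Euler simulation of one DDM trial; returns (final state x, correct?)."""
    x = 0.0
    step_sd = sigma * dt ** 0.5
    while abs(x) < theta:
        x += drift * dt + rng.gauss(0.0, step_sd)
    return x, x > 0  # with drift > 0, the upper bound is the correct choice


rng = random.Random(1)
results = [ddm_trial(rng) for _ in range(300)]
end_correct = [abs(x) for x, ok in results if ok]
end_error = [abs(x) for x, ok in results if not ok]

# Both means are close to theta: the one-dimensional state at decision time
# carries no trace of correctness beyond which bound was hit.
print(len(end_correct), round(mean(end_correct), 3))
print(len(end_error), round(mean(end_error), 3))
```

In the attractor network, by contrast, the state at decision time is two-dimensional, so the activity of the losing population can differ between correct and error trials even when the winning population reaches the same threshold.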

Typical experiments on monkeys make use of reward-based protocols, hence with feedback. This makes it difficult to obtain electrophysiological data in the absence of feedback. Yet, as discussed in this paper, the faster buildup of neural activity in post-correct trials than in post-error trials, as observed by Purcell and Kiani (2016) on monkeys in random-dot experiments, can be understood within our framework as a faster dynamics near the boundary between attraction basins in the post-correct case.

As discussed above, another prediction of the model is that, in the case of PES or PIA, the mean activity of the losing neural population is different for correct and error decisions, a difference which should correlate with the amplitude of the effect.

### First- and higher-order sequential effects

Sequential effects can be categorized as first order (if caused by the immediately previous trial), or higher order (if caused by earlier trials in the sequence; Laming, 1979a; Soetens et al., 1984, 1985; Cho et al., 2002). Post-error adjustments have also been experimentally observed at higher order (Laming, 1979a).

Within the framework of attractor networks, the sequential effects in choice repetitions are explained by a starting bias, as discussed by Gao et al. (2009) and Bonaiuto et al. (2016), and in the present paper. As stated by Gao et al. (2009), without any additional memory module, an attractor network cannot reproduce the transition between automatic facilitation and strategic expectancy (Laming, 1968). In our network, for too short RSIs (a few dozen milliseconds) the sequential effects are too strong to be plausible. Decision conflict mechanisms (Jones et al., 2002) could be implemented to correct this effect and to investigate other effects of repetitions and alternations (Gao et al., 2009).

To account for higher-order effects, Gao et al. (2009) considered a dynamical network making use of additional memory modules. This network is explicitly set up to reproduce automatic facilitation and strategic expectancy effects. In this model, even the first-order effects result from a coupling between a short-term memory module and the decision network. In contrast, we have shown here that a single attractor network, without memory units, presents first-order effects as an intrinsic property of the dynamics.

However, because of the nature of the dynamics in our model, we do not expect to reproduce higher-order effects. Indeed, for parameters for which the model exhibits first-order sequential effects (*I*_{CD,max} = 0.035 nA), we find neither second-order sequential effects, nor post-error adjustments, as illustrated in Figure 13, *C* and *D*.

One may ask whether a more complex architecture, taking into account other brain areas, could account for higher-order repetition biases and post-error adjustments effects as resulting from some intrinsic properties of the dynamics, in the absence of specific memory units.

### Working memory and decision-making

In this work we have considered a free response time task (Roitman and Shadlen, 2002), in which the subject must make a decision as soon as possible. In a different protocol, the delayed visual motion discrimination experiment (Shadlen and Newsome, 2001), the subject must make the decision at a prescribed time after the onset of the stimulus. In such a task, the decision choice must be stored to be retrieved at the prescribed instant of time. In the original attractor neural network model (Wang, 2002), the decision is stored as in a working memory. As discussed at the beginning of this paper, within the framework of a single module of attractor decision network, the CD considered in the present paper allows the system to make successive decisions, at the price of removing the working memory behavior. An important issue is to understand how the decision-making system can adapt itself to these opposite contexts (for a model with gain modulation, see Niyogi and Wong-Lin, 2013). It is not unrealistic to expect a control mechanism acting on the inhibitory current. Depending on the task, the inhibitory current could be sent either just after the decision has been made, or later after the end of the delay period. In the latter case, a prediction is that, compared with cases without delay, there should be weaker post-error effects, but stronger repeated/alternated effects.

An alternative is to have a more complex architecture. However, the memory units used by Gao et al. (2009) are not appropriate for dealing with delayed discrimination experiments. For experiments with delays, Murray et al. (2017) consider two interacting modules, one implementing the posterior parietal cortex and another one the prefrontal cortex. It will be interesting to extend the present work by adding a working memory module in line with Murray et al. (2017), to obtain a network performing sequential decision-making while keeping the working memory behavior.

Finally we note that various brain areas have been shown to be involved in sequential decision tasks in which the memory of the last decision has to be maintained (Middlebrooks and Sommer, 2012; Donahue et al., 2013; Abzug and Sommer, 2018). This suggests more generally that a broader network is necessary for decision tasks requiring memory.

### Future prospects

During behavioral tasks, subjects are not always aware of their mistakes (Yeung and Summerfield, 2012), but do show PES. One may thus ask why one does not generally become aware that an error has been made, given that the neural dynamics differ following an error or a success. As discussed in the present work, these differences in the dynamics are very subtle. The post-error and post-correct firing rates have broad distributions, with some common properties (the same mean, for example). The strong overlap of these distributions makes it difficult to infer the correctness of the decision on a single-trial basis. Yet the tails of the post-error synaptic distribution should, in some cases, allow one to infer that an error has been made. It would be interesting to see in behavioral experiments whether the post-error effects can be related to the confidence in one's decision (Wei et al., 2015; Insabato et al., 2017).

## Footnotes

K.B. acknowledges a fellowship from the ENS Paris-Saclay. We thank Jerôme Sackur, Jean-Rémy Martin, and Laurent Bonnasse-Gahot for useful discussions, and the anonymous reviewers for helpful and constructive comments.

The authors declare no competing financial interests.

- Correspondence should be addressed to Kevin Berlemont at kevin.berlemont{at}lps.ens.fr