Abstract
For nearly 50 years, the dominant account of decision-making holds that noisy information is accumulated until a fixed threshold is crossed. This account has been tested extensively against behavioral and neurophysiological data for decisions about consumer goods, perceptual stimuli, eyewitness testimony, memories, and dozens of other paradigms, with no systematic misfit between model and data. Recently, the standard model has been challenged by alternative accounts that assume that less evidence is required to trigger a decision as time passes. Such “collapsing boundaries” or “urgency signals” have gained popularity in some theoretical accounts of neurophysiology. Nevertheless, evidence in favor of these models is mixed, with support coming from only a narrow range of decision paradigms compared with a long history of support from dozens of paradigms for the standard theory. We conducted the first large-scale analysis of data from humans and nonhuman primates across three distinct paradigms using powerful model-selection methods to compare evidence for fixed versus collapsing bounds. Overall, we identified evidence in favor of the standard model with fixed decision boundaries. We further found that evidence for static or dynamic response boundaries may depend on specific paradigms or procedures, such as the extent of task practice. We conclude that the difficulty of selecting between collapsing and fixed bounds models has received insufficient attention in previous research, calling into question some previous results.
Introduction
Over the past 50 years, psychology and neuroscience have contributed to a deeper understanding of decision-making based on “diffusion models” (Stone, 1960; Laming, 1968; Ratcliff, 1978). Diffusion models assume that noisy information is gradually sampled from the environment. The process continues until the balance of evidence reaches one of two decision boundaries, triggering a choice. In hundreds of human studies with thousands of participants, diffusion models have described data very accurately, providing insight into many theoretical and practical research areas, including decisions about consumer goods, memories, motion stimuli, clinical populations, aging, sleep deprivation, and psychopharmacology (Ratcliff, 1978; Krajbich et al., 2010; Krajbich and Rangel, 2011; Ratcliff and Van Dongen, 2011). More recently, the same models have been correlated with various components of neuroimaging and neurophysiological data in humans (Forstmann et al., 2008, 2010; O'Connell et al., 2012; Ratcliff et al., 2009; Schurger et al., 2012). In Macaca mulatta, the firing rates of some neurons seem to behave like processes from diffusion models (Bollimunta et al., 2012; Ding and Gold, 2010, 2012a; Hanes and Schall, 1996; Heitz and Schall, 2012; Kiani and Shadlen, 2009; Pouget et al., 2011; Purcell et al., 2010; Purcell et al., 2012; Ratcliff et al., 2003; Roitman and Shadlen, 2002; Woodman et al., 2008): during decision-making, the firing rate of these neurons increases until it reaches a threshold value and a behavioral response is initiated. The notion that firing rate in some cells might represent a decision threshold has even received independent support from neural studies of saccade production (Brown et al., 2008; Hanes et al., 1998; Paré and Hanes, 2003).
Diffusion models typically assume fixed decision boundaries; the amount of evidence required to trigger a decision does not change with time (dashed lines, left panel, Fig. 1). Recently, however, a more complicated assumption has gained popularity: collapsing boundaries (solid curved lines, Fig. 1), sometimes interpreted as urgency signals, where decisions are triggered by less and less evidence as time passes (Bowman et al., 2012; Cisek et al., 2009; Ditterich, 2006a,2006b; Drugowitsch et al., 2012; Milosavljevic et al., 2010; Thura et al., 2012). Figure 1 demonstrates how models with fixed and collapsing bounds make different predictions for response times. The collapsing bounds reduce the number of slow decisions, making response time distributions less skewed.
Figure 1. Left, Diffusion models with fixed (dashed) or collapsing (solid) decision boundaries. A model with collapsing boundaries can terminate the evidence accumulation process earlier than a model with fixed boundaries, resulting in faster decisions. Right, Graph showing how the models lead to different predictions for response time distributions.
For decades, theories using fixed bounds have provided precise accounts of many aspects of the data. It is not clear whether collapsing bounds and urgency signal models fit data quite as well. Nevertheless, some researchers have asserted that the nonstationary assumption is true; for example, “It turns out that the prediction was misguided. There is no reason to assume the terminating bounds are flat (i.e., constant as a function of elapsed decision time)” (Shadlen and Kiani, 2013). We addressed these problems in a large-scale survey and found, overall, evidence in favor of the fixed bound approach.
Materials and Methods
Model details.
We study diffusion processes through discrete-state approximations. The basic assumption of the diffusion process is that a decision-maker accumulates noisy evidence from the environment over time. The accumulated evidence evolves toward one of two decision criteria that correspond to the two response alternatives; when the process reaches one of these boundaries, a response is triggered. The predicted response time is the sum of the time taken to reach the boundary and an offset time required for nondecision-related components of choice, such as encoding the stimulus and executing a motor response. The predicted response corresponds to the boundary that was crossed.
The basic diffusion model (as just described) is governed by five parameters: the average rate at which the process drifts toward one boundary (drift rate, v); noise in the diffusion process (s, a scaling parameter not typically estimated from data and fixed in all model fits herein); the separation of the boundaries (a); the starting position of the diffusion process (z); and the time taken for nondecision processes (ter). This basic diffusion model for decision-making is over 50 years old (Stone, 1960). Modern accounts assume variability in three model parameters from decision to decision (reflecting, e.g., fluctuations in attention). Such variability addresses well-known deficiencies in the basic diffusion model—most prominently, if the boundaries are equidistant from the starting point, then the predicted correct and error response times are identical (Feller, 1968). The three parameters assumed to vary from decision to decision are the start point (z; variability sz), the drift rate (v; variability η), and the nondecision time (ter; variability st).
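As a concrete illustration of this generative process, the following minimal sketch simulates single trials of the diffusion model using the Euler approximation described later in the Materials and Methods (0.01 s steps). The function name, the parameter values, the noise scaling s = 0.1, and the uniform (start point, nondecision time) and Gaussian (drift rate) forms of between-trial variability are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def simulate_trial(v, a, z, ter, eta, sz, st, s=0.1, dt=0.01, rng=None):
    """Simulate one diffusion-model trial with between-trial variability.

    Returns (response, rt): response is 1 if the upper boundary was hit,
    0 if the lower boundary was hit.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Between-trial variability: drift ~ Normal(v, eta); start point and
    # nondecision time ~ uniform around their means (assumed forms).
    drift = rng.normal(v, eta)
    start = rng.uniform(z - sz / 2, z + sz / 2)
    t0 = rng.uniform(ter - st / 2, ter + st / 2)

    x, t = start, 0.0
    while 0.0 < x < a:                      # accumulate until a boundary is hit
        x += drift * dt + s * np.sqrt(dt) * rng.normal()
        t += dt
    return (1 if x >= a else 0), t + t0     # RT = decision time + nondecision time

# Example: 1000 trials with illustrative parameter values.
rng = np.random.default_rng(1)
sims = [simulate_trial(v=0.2, a=0.12, z=0.06, ter=0.3,
                       eta=0.1, sz=0.02, st=0.1, rng=rng) for _ in range(1000)]
```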
Collapsing bounds and urgency signal models.
The collapsing bounds model we analyzed was identical to the fixed bounds model just described except that the upper boundary decreased from its initial value (a) toward an asymptote, governed by the asymptotic boundary setting (a′; Eq. 1), that was not forced to be the halfway point between the upper and lower boundaries; the lower boundary increased symmetrically.
Figure 2. Various forms of dynamic diffusion models. The Weibull cumulative distribution function can generate decision boundaries that collapse at early (left) or late (middle right) stages of processing or gradually throughout processing (middle left). Orange lines indicate strong (left) and complete (middle left, middle right) collapse of the upper and lower boundaries, which imposes a hard deadline on processing. Blue lines represent milder forms of collapsing boundaries that do not meet at the midpoint. Right, Two illustrative urgency signal paths, which are applied as a gain (i.e., multiplicative) function on drift rates according to Equation 2. The orange and blue paths show rapid and delayed urgency signals, respectively, which are functionally similar to early and late collapsing boundaries.
Figure 3. The Weibull cumulative distribution function mimics parametric forms that might be proposed for collapsing bounds. Orange and blue lines show examples of candidate stationary and dynamic boundaries of various parametric functions: fixed, exponential, hyperbolic, and logistic. Overlaid black dotted lines indicate the best fitting Weibull function, which in all cases closely mimics the generating function.
In the collapsing bounds model, we assumed the upper threshold u at time t after onset of evidence accumulation would be as follows:

u(t) = a − [1 − exp{−(t/λ)^k}] (a/2 − a′),     (1)

where λ and k are the scale and shape parameters of the Weibull cumulative distribution function and a′ is the asymptotic boundary setting; the lower boundary followed the mirror-image trajectory about a/2.
To all datasets, we fit four variants of the collapsing bounds model that corresponded to different psychological assumptions about the decision process. These assumptions informed choices about which parameters were freely estimated from data and which were fixed (not estimated from data). Our results primarily focus on the “late collapse” models in Figure 2, which freely estimated the stage of the decision process when boundaries began to collapse (λ, scale parameter of the Weibull distribution) and the extent to which the boundaries collapsed (a′, asymptotic boundary setting, Eq. 1). The shape parameter was fixed at k = 3—the value used in the middle right panel of Figure 2—to impose a “late collapse” decision strategy, and was also a representative value from preliminary model fitting when no constraint was placed on the shape parameter. The second model variant permitted freedom in whether the decision boundaries collapsed early or late in processing and in the rate of collapse (shape and scale parameters freely estimated from data) and fixed the asymptotic boundary setting to instantiate complete collapse, with the upper and lower boundaries meeting (a′ = 0). The third variant fixed the shape at k = 3 and imposed complete collapse on the asymptotic boundary (a′ = 0) and freely estimated the time at which the boundaries commenced their trajectory toward complete collapse. The final model freely estimated all three parameters, which allowed freedom in the rate and form of the collapsing boundary and in the extent of the collapse. The results described below were from the first model and they generalized across the four variants of the collapsing bounds model used.
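For illustration, the short sketch below traces collapsing decision boundaries of the form described above (Weibull cumulative distribution function with scale λ, shape k, and asymptotic boundary setting a′, where a′ = 0 yields complete collapse to the midpoint). The function name and parameter values are illustrative assumptions rather than the authors' code.

```python
import numpy as np

def weibull_bounds(t, a, a_prime, lam, k):
    """Upper and lower decision boundaries at time(s) t (Eq. 1 and its mirror image).

    a: initial boundary separation; a_prime: asymptotic offset from the midpoint
    (a_prime = 0 -> complete collapse); lam, k: Weibull scale and shape."""
    collapse = 1.0 - np.exp(-(t / lam) ** k)          # Weibull CDF in [0, 1)
    upper = a - collapse * (a / 2.0 - a_prime)        # decreases from a toward a/2 + a'
    lower = 0.0 + collapse * (a / 2.0 - a_prime)      # increases from 0 toward a/2 - a'
    return upper, lower

# Example: a "late collapse" boundary with the shape fixed at k = 3, as in the text.
t = np.linspace(0.0, 2.0, 201)
upper, lower = weibull_bounds(t, a=0.12, a_prime=0.0, lam=1.0, k=3.0)
```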
We also considered a different kind of collapsing bounds model, instantiated via an “urgency signal” (Churchland et al., 2008; Cisek et al., 2009; Ditterich, 2006a; Hanks et al., 2014; Thura et al., 2012). In this model, the boundaries were fixed throughout the decision, but the evidence accumulation process was subject to a gain parameter with a value that increased with the duration of the decision (Fig. 2). The gain parameter was implemented as a three-parameter logistic function following Ditterich (2006a). The urgency signal γ at time t after the onset of evidence accumulation is given by the following:
where d is a delay parameter and sx and sy are shape parameters (Fig. 2).
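Because Equation 2 is not reproduced here, the sketch below illustrates only the general mechanism: a generic three-parameter logistic gain (delay d, shape parameters sx and sy) applied multiplicatively to the drift rate. The exact parameterization of Ditterich (2006a) may differ, and the function names and values are illustrative assumptions.

```python
import numpy as np

def urgency_gain(t, d, s_x, s_y):
    """Generic logistic urgency gain: low early in the trial, rising after a
    delay d, with s_x controlling steepness and s_y the asymptotic gain.
    An illustrative stand-in for Equation 2, not its exact form."""
    return 1.0 + s_y / (1.0 + np.exp(-s_x * (t - d)))

def effective_drift(v, t, d, s_x, s_y):
    """Urgency applied multiplicatively to the drift rate (gain * v)."""
    return urgency_gain(t, d, s_x, s_y) * v

# Example: the gain is near 1 early in the trial and grows toward 1 + s_y later.
t = np.linspace(0.0, 2.0, 201)
g = urgency_gain(t, d=0.5, s_x=8.0, s_y=2.0)
```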
We also ran our analyses twice: once with the fixed and collapsing bounds models described above and once with the data modeled as a mixture of those models (at 98%) and a random uniform process across the observed range of response times (at 2%). This mixture represents the idea that a small fraction of responses might be contaminants unrelated to the stimulus or the regular decision process. The two analyses agreed closely and below we discuss results for the contaminant mixture model, which can provide more stable parameter estimates in the presence of outlying data (Ratcliff and Tuerlinckx, 2002).
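A minimal sketch of the 98/2 contaminant mixture follows: the decision model's response time density is combined with a uniform density over the observed response time range. The `mixture_density` function and the placeholder model density are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np

def mixture_density(rt, model_density, rt_min, rt_max, p_contaminant=0.02):
    """Density of an RT under a mixture of the decision model (98%) and a
    uniform contaminant process over the observed RT range (2%).
    `model_density` is a hypothetical callable returning the model's RT density."""
    uniform_density = 1.0 / (rt_max - rt_min)
    return ((1.0 - p_contaminant) * model_density(rt)
            + p_contaminant * uniform_density)

# Example with a placeholder density (an exponential shifted by 0.3 s).
fake_model_density = lambda rt: np.where(rt > 0.3, 2.0 * np.exp(-2.0 * (rt - 0.3)), 0.0)
d = mixture_density(np.array([0.4, 0.8, 1.5]), fake_model_density, rt_min=0.3, rt_max=2.5)
```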
Datasets for modeling.
In a classic experiment, previously interpreted as supporting a fixed-bound diffusion model because the neural firing rates for fast and slow responses appeared to hit the same maximum level to trigger a decision, Roitman and Shadlen (2002) had two monkeys make decisions about random dot motion (RDM; Fig. 4). An RDM decision is based on a cloud of dots, of which a certain percentage move coherently toward the left or right of the screen while the remaining dots move randomly. The monkeys' task was to indicate the direction of coherent motion by making eye movements. The percentage of coherently moving dots was varied from trial to trial across six levels (0%, 3.2%, 6.4%, 12.8%, 25.6%, and 51.2%). One rhesus monkey completed 2614 trials (Monkey B) and another completed 3534 trials (Monkey N).
Palmer et al. (2005), Experiment 1, replicated the experiment of Roitman and Shadlen (2002), but used six human participants. Methodological details were almost identical except for the duration of data collection sessions, the precise timing of feedback information, and the nature of rewards, all of which were tailored to human needs. The human participants each completed ∼560 trials.
Ratcliff and McKoon (2008), Experiment 1, replicated the experiment of Palmer et al. (2005) with 15 human participants. Methodological details were similar, except that motion coherence was varied from trial to trial across six different levels (5%, 10%, 15%, 25%, 35%, and 50%), responses were indicated by button presses rather than saccades, and feedback on correct versus incorrect decisions was provided immediately rather than after a short delay. The human participants each completed ∼960 trials.
Ratcliff et al. (2007) had two monkeys make decisions about patches of pixels that varied in brightness (Fig. 4). Monkeys were shown a square of black and white pixels that ranged across six levels in the proportion of black pixels (“bright”: 2%, 35%, 45%; “dark”: 55%, 65%, 98%). Responses were indicated by saccades. One rhesus monkey completed 12,021 trials and another completed 7632 trials.
Middlebrooks and Schall (2014) had two monkeys and eight humans make decisions about square 10 × 10 checkerboards that contained more cyan or magenta checkers, similar to the Ratcliff et al. (2007) brightness discrimination task. The percentage of cyan to magenta checkers was varied randomly from trial to trial and ranged across seven levels, determined separately for each monkey (Monkey B: 41%, 45%, 48%, 50%, 52%, 55%, 59%; Monkey X: 35%, 42%, 47%, 50%, 53%, 58%, 65%) and all humans completed the same set (35%, 42%, 46%, 50%, 54%, 58%, 65%). Responses were indicated by saccades. Perceptual categorization trials were randomly interleaved with stop-signal trials in which a signal would appear at a variable time after stimulus onset that indicated the participant must inhibit a prepared response. We only analyzed the no-stop trials for consistency with the experimental designs of the remaining seven datasets in our analysis. This gave 10,212 trials for Monkey B, 10,762 for Monkey X, and an average of 483 trials per human participant.
Ratcliff et al. (2001), Experiment 2 (young subjects), had humans decide whether the distance between two dots was small or large (Fig. 4). The separation of the dots ranged across 32 equal-sized steps from 11/16 inch (1.7 cm, “small”) to 15/16 inch (2.4 cm, “large”). Following Ratcliff et al. (2001), we collapsed data from the 32 levels with similar response time and accuracy performance into four conditions for model fitting.
In separate blocks, participants were instructed to respond as quickly as possible (speed-emphasis) or as accurately as possible (accuracy-emphasis). The 17 human participants each completed ∼1000 accuracy-emphasis trials, which contributed to the primary analyses, and ∼1000 speed-emphasis trials, which were used for a replication analysis. Responses were indicated by button presses.
Ratcliff et al. (2003) replicated Ratcliff et al.'s (2001) experiment but used two monkey participants. Methodological details were similar, except that the duration of data collection was longer, dot separation ranged from 2° to 10° of visual angle in increments of 1°, there were no speed- or accuracy-emphasis instructions, and responses were indicated by saccades. One rhesus monkey completed 11,495 trials and another completed 5037 trials.
Experiment 1.
Thirty-nine undergraduates from the University of Newcastle, Australia, completed speeded decisions in an RDM task. Methodological details were similar to Palmer et al. (2005) except that motion coherence was manipulated across six different levels (0%, 2.5%, 5%, 10%, 20%, 40%) and responses were indicated by button presses. The human participants each completed 432 trials.
An additional 74 undergraduates from the University of Newcastle completed RDM decisions for the replication analyses reported. All methodological details were the same as the first experiment except for a delayed feedback procedure similar to that of Palmer et al. (2005) and Roitman and Shadlen (2002). In particular, after a response, participants did not receive feedback until at least 1 s (44 participants) or 2 s (30 participants) had elapsed since stimulus presentation.
Fitting diffusion models to data: parameter estimation and model selection.
We estimated parameters for each model separately for each individual participant using quantile maximum products estimation (QMPE) (Heathcote et al., 2002; Heathcote and Brown, 2004; similar to χ2 and multinomial maximum-likelihood estimation). The observed data were characterized by nine deciles, calculated separately for correct and incorrect responses. The QMP statistic is used to quantify agreement between the model and data by comparing the observed and predicted proportions of data falling into each interdecile bin. Because the urgency signal and collapsing bounds models do not have closed-form analytic solutions for their predicted distributions, we evaluated the predictions by Monte Carlo simulation using 10,000 replicates per experimental condition during parameter estimation and 50,000 replicates per condition for precisely evaluating predictions at the search termination point. To keep the model comparison fair, we used the same method for the fixed-bounds diffusion (even though closed-form solutions are readily available). To simulate the models, we used Euler's method to approximate their representation as stochastic differential equations, with a step size of 0.01 s fixed everywhere. In initial tests, we confirmed that our choice of step size made no difference other than slight linear inflation of the predicted response time distribution and that making the step size very small led to perfect agreement with the closed-form analytic solutions for the fixed-bound model (Brown et al., 2006).
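The following sketch illustrates the general logic of this quantile-based fit statistic: observed deciles define interdecile bins for correct and error responses, model simulations supply the predicted probability mass of each bin, and the log quantile maximum products statistic sums observed counts weighted by log predicted probabilities. It is a schematic reconstruction of the QMPE idea with simplified edge handling, not the authors' code.

```python
import numpy as np

def log_qmp(obs_correct, obs_error, sim_rt, sim_resp):
    """Schematic log quantile maximum products statistic.

    obs_correct, obs_error: arrays of observed RTs for each response type.
    sim_rt, sim_resp: simulated RTs and responses (1 = correct, 0 = error)."""
    total = 0.0
    for obs, resp in ((obs_correct, 1), (obs_error, 0)):
        if len(obs) == 0:
            continue
        # Nine deciles define ten interdecile bins for this response type.
        edges = np.concatenate(([0.0],
                                np.quantile(obs, np.arange(0.1, 1.0, 0.1)),
                                [np.inf]))
        n_obs = np.histogram(obs, bins=edges)[0]                 # observed counts per bin
        sim = sim_rt[sim_resp == resp]
        # Predicted probability of each bin, including the choice probability.
        p_pred = np.histogram(sim, bins=edges)[0] / max(len(sim_rt), 1)
        p_pred = np.maximum(p_pred, 1e-10)                       # guard against log(0)
        total += np.sum(n_obs * np.log(p_pred))
    return total
```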
We adjusted the model parameters to optimize goodness-of-fit using differential evolution methods (Ardia et al., 2013; Mullen et al., 2011). We also tested parameter optimization via particle swarm and simplex algorithms, but found that both approaches gave poorer model recovery performance. The model parameters for each participant and model were estimated independently. The mean drift rate parameter (v) was estimated freely for each difficulty level (coherence level in random dot motion tasks, or brightness level in brightness discrimination). For example, the Ratcliff and McKoon (2008) experiment had six coherence levels, so we estimated six drift rate parameters. Roitman and Shadlen (2002), Palmer et al. (2005), and Experiment 1 had RDM stimuli with a 0% coherence condition, for which we fixed the drift rate to zero. We similarly fixed the drift rate to zero for the 50% cyan/magenta stimuli in Middlebrooks and Schall's (2014) data. All other model parameters were estimated once for each participant and model combination (i.e., constant across difficulty levels): boundary height (a), start point of evidence accumulation (z), start-point variability (sz), nondecision time (ter), variability in nondecision time (st), and drift rate variability (η). For the urgency signal and collapsing bounds models, we estimated the three additional parameters described in the previous section. We set wide bounds on all parameters and ran 120 particles for 500 search iterations. We repeated this parameter estimation exercise three times independently for each model fit to each participant's data and chose the best set of parameters overall.
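The differential evolution references cited above (Ardia et al., 2013; Mullen et al., 2011) describe the DEoptim package for R; the sketch below shows a rough Python analog using SciPy, reusing the schematic `log_qmp` helper from the previous sketch and a toy stand-in simulator. All names, bounds, and data in the example are illustrative assumptions, not the authors' setup.

```python
import numpy as np
from scipy.optimize import differential_evolution

rng = np.random.default_rng(1)

def simulate_model(params, n_sims=10_000):
    """Toy stand-in for the diffusion-model simulator sketched earlier;
    returns simulated RTs and responses (1 = correct, 0 = error)."""
    v, a, z, ter = params
    rt = ter + rng.gamma(shape=2.0, scale=a / max(abs(v), 1e-3), size=n_sims)
    resp = (rng.random(n_sims) < 1.0 / (1.0 + np.exp(-10.0 * v))).astype(int)
    return rt, resp

def objective(params, data):
    """Negative log-QMP for one candidate parameter vector (to be minimized);
    reuses the schematic log_qmp helper defined in the previous sketch."""
    sim_rt, sim_resp = simulate_model(params)
    return -log_qmp(data["rt_correct"], data["rt_error"], sim_rt, sim_resp)

# Illustrative "observed" data and wide bounds on (v, a, z, ter), as in the text.
data = {"rt_correct": 0.3 + rng.gamma(2.0, 0.3, 500),
        "rt_error": 0.3 + rng.gamma(2.0, 0.4, 100)}
bounds = [(-0.5, 0.5), (0.05, 0.3), (0.01, 0.29), (0.1, 0.6)]

# SciPy's popsize is a per-parameter multiplier, so popsize=30 with 4 parameters
# gives ~120 population members, matching the "120 particles" described above.
result = differential_evolution(objective, bounds, args=(data,),
                                popsize=30, maxiter=500, polish=False, seed=1)
```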
We used three different approaches to model selection to compare the goodness-of-fit for the fixed bounds, urgency signal, and collapsing bounds models. The fixed bound model is nested within both of the other models, so in terms of raw goodness-of-fit it can never outperform either. The three model selection methods evaluated whether the improvement in fit observed for the urgency signal or collapsing bounds models was sufficient to justify their extra complexity. We used the Akaike Information Criterion (AIC) (Akaike, 1974), the Bayesian Information Criterion (BIC) (Schwarz, 1978), and nested-model likelihood ratio tests (Edwards, 1992). The results of the three model selection methods were very similar, so we report only the BIC results below.
We used the estimated BIC values to approximate posterior model probabilities, which account for uncertainty in the model selection procedure. Assuming a uniform prior across the m = 3 models under consideration and that the data-generating model was one of those under consideration, the BIC-based approximation for model Mi given data D (Wasserman, 2000) is as follows:

PBIC(Mi | D) = exp(−BICi/2) / Σj exp(−BICj/2),

where the sum runs over the m models under consideration.
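A minimal sketch of this computation is given below; subtracting the minimum BIC before exponentiating avoids numerical underflow and cancels in the ratio, leaving the probabilities unchanged. The function name and example BIC values are illustrative.

```python
import numpy as np

def bic_posterior_probs(bics):
    """Approximate posterior model probabilities from BIC values, assuming a
    uniform prior over the candidate models (Wasserman, 2000)."""
    bics = np.asarray(bics, dtype=float)
    # Rescale by the minimum BIC for numerical stability; the shift cancels.
    rel = np.exp(-0.5 * (bics - bics.min()))
    return rel / rel.sum()

# Example: fixed bounds, urgency signal, collapsing bounds (illustrative BICs).
print(bic_posterior_probs([1520.3, 1531.8, 1526.1]))
```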
We also repeated the estimation of approximate posterior model probabilities comparing all four versions of the collapsing bounds model with the fixed bound and urgency signal models (i.e., comparison of m = 6 models). This did not change the conclusions regarding which model structure provided the best account of each dataset (i.e., fixed bounds, urgency signal, or collapsing bounds), so we do not report it further.
Model recovery.
We observed that reliably discriminating between synthetic data generated by fixed bound versus collapsing bound models was not always easy. Prompted by this, we performed extensive model recovery analyses to ensure that our results were robust, sensitive, and unbiased and not due to artifacts of the parameter estimation routines. We used a model recovery procedure (Navarro et al., 2004; Wagenmakers et al., 2004) that involves simulating many datasets from one model (the fixed bounds model) and fitting those simulated datasets with both the fixed and collapsing bounds models. We then assessed the difference in BIC between the two fitted models for each simulated dataset. If model recovery is easy, then the fit of the data-generating model should yield a better BIC than its competitor model. This process was then repeated using simulated data from the collapsing bounds model. We did not repeat this procedure using the urgency signal model because it was rarely preferred over the fixed or collapsing bounds models in data (3 of 93 subjects; Fig. 5) and because of the heavy computational burden.
For the simulation study, we used the first variant of our collapsing bounds models (the variant that forms the primary focus of our results), in which the shape parameter was fixed (k = 3). For each experiment under examination, and separately for the fixed and collapsing bounds models, we averaged the best fitting parameter estimates across subjects and used these to simulate 100 synthetic datasets from each model, with sample size equal to that of the real datasets (with very large simulated sample sizes, model recovery was close to perfect). We ran the parameter estimation routine three times independently for each simulated dataset and chose the best set of parameters for each simulated dataset (i.e., the parameter set with the best value of the objective function). In total, the model recovery procedure required ∼10,000 parameter estimation exercises.
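The recovery procedure can be summarized schematically as follows, with `simulate_fixed`, `simulate_collapse`, and `fit` standing in as hypothetical placeholders for the simulation and fitting routines described above; the sign convention matches the bottom row of Figure 5 (negative BIC differences support the fixed bounds model).

```python
import numpy as np

def model_recovery(simulate_fixed, simulate_collapse, fit, n_datasets=100, n_runs=3):
    """Schematic model recovery study.

    simulate_fixed / simulate_collapse: callables returning one synthetic dataset
    at the empirical sample size (parameters averaged over subjects).
    fit: callable (dataset, model_name) -> BIC; run n_runs times per model with
    the best (lowest) BIC kept, mirroring the fitting procedure above.
    Returns BIC(fixed) - BIC(collapse) per dataset for each generating model;
    negative values support the fixed bounds model, positive values the
    collapsing bounds model (as in the bottom row of Fig. 5)."""
    results = {}
    for generator_name, simulate in (("fixed", simulate_fixed),
                                     ("collapse", simulate_collapse)):
        diffs = []
        for _ in range(n_datasets):
            data = simulate()
            bic_fixed = min(fit(data, "fixed") for _ in range(n_runs))
            bic_collapse = min(fit(data, "collapse") for _ in range(n_runs))
            diffs.append(bic_fixed - bic_collapse)
        results[generator_name] = np.array(diffs)
    return results
```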
Results
Evidence for stationary and dynamic decision-making models
We conducted the largest survey of data and performed the most intensive analyses of fixed versus collapsing decision boundaries to date. We gathered nine datasets from eight studies of the sort that have been used to support dynamic decision-making models. We examined three distinct perceptual decision-making paradigms—random dot motion, brightness discrimination, and dot separation (Fig. 4)—with experiments conducted in four independent laboratories. In total, we used data from eight nonhuman primates (M. mulatta) and 85 humans, with >116,000 decisions in all.
Figure 4. The three decision paradigms from the human and nonhuman primate studies: random dot motion, brightness discrimination, and dot separation.
For an overwhelming majority of human participants (72 of 85), the data supported the fixed bounds model over both the collapsing bounds and urgency signal models. Of the remaining 13 participants, nine were best described by the collapsing bounds model and results for the other four were inconclusive. Data from the eight nonhuman primates painted a different picture: four clearly supported the collapsing bounds model, two supported the fixed bounds model, and two supported the urgency signal model.
The results can be further understood by examining the differences between experiments, shown as stacked bar plots of cumulative posterior model probabilities separately for each subject in the top row of Figure 5. Four of the five experiments with human participants provided evidence in favor of the fixed bounds model, with just one experiment providing evidence in favor of the collapsing bounds model. That single experiment (Palmer et al., 2005), with just six participants, yielded as many participants classified as having collapsing bounds as all the other human experiments combined, despite those other experiments including 79 participants in total. It is more difficult to draw such conclusions from the monkey data, because only two monkeys participated in each experiment.
Figure 5. Top row, BIC-based approximations to posterior model probabilities in favor of the fixed bounds, urgency signal, and collapsing bounds models separately for the nine datasets. Shading represents the three models and columns represent individual subjects, with subject labels from the original reports below. Second row, Average decision boundaries for fixed and collapsing models for each experiment using parameters averaged over subjects. Third, fourth, and fifth rows, Choice probabilities and response times for both data and model fits, as quantile-probability plots. Panels show the probability of a correct response on the x-axes and response time (in seconds) on the y-axes. Green and red crosses represent correct and error responses, respectively, across experimental conditions. Vertical placement of the crosses shows, for each condition, the 10th, 30th, 50th (i.e., median), 70th, and 90th percentiles of the response time distribution, aggregated across subjects. Predictions of the fixed bounds, urgency signal, and collapsing bounds models are overlaid on data as black lines. Bottom row, Results of the model recovery simulation study. Histograms represent distributions of the difference in BIC values from data simulated from the fixed (gray histograms) and collapsing bounds (black histograms) models. Distributions that fall to the left of zero support the fixed bounds model and those to the right of zero support the collapsing bounds model. Crosses represent the corresponding BIC difference values from the model fits to data. Heading acronyms refer to the nine datasets: RS (2002), Roitman and Shadlen (2002); RCS (2003), Ratcliff et al. (2003); RHHSS (2007), Ratcliff et al. (2007); MS (2014)–M, macaques from Middlebrooks and Schall (2014); PHS (2005), Palmer et al. (2005); MS (2014)–H, humans from Middlebrooks and Schall (2014); RTM (2001), Ratcliff et al. (2001); RM (2008), Ratcliff and McKoon (2008); Experiment 1, reported in this manuscript.
The second row of panels in Figure 5 shows the average boundaries estimated for the fixed and collapsing bound models in each experiment using parameters averaged across subjects. In the datasets that supported the dynamic models, the estimated boundaries for the fixed and collapsing bounds models were very different; the collapsing bounds partially truncated the slow right tails of the predicted response time distributions, reducing their skew. This is shown in the quantile-probability plots in Figure 5: the datasets that supported the use of urgency signals or collapsing bounds had response time distributions that were less skewed than those from experiments that did not support the dynamic models.
In addition to the data reported above, we replicated Experiment 1 by collecting data from an additional 74 participants under different experimental conditions (withholding feedback for a short delay) and by reanalyzing data from 17 participants in Ratcliff et al. (2001) with different instructions (emphasizing decision speed, rather than accuracy). Of these 91 additional human datasets, 64 showed very strong evidence for the fixed bounds model (i.e., PBIC(Mfixed | D) ≥ .99) and an additional 12 showed strong support (i.e., PBIC(Mfixed | D) between .95 and .99). Of the remaining 15 subjects, one showed strong preference for the collapsing bounds model and the other 14 showed no strong preference for any model.
Collapsing boundaries and urgency signals do not replace between-trial variability in parameters
The modern fixed bounds model assumes that the drift rate, start point of evidence accumulation, and nondecision time vary randomly across trials (Ratcliff and Tuerlinckx, 2002; Ratcliff et al., 1999). The inclusion of these variability parameters has a demonstrated history of improving the fit of the fixed bounds model to data. For example, Ratcliff (1978) demonstrated that between-trial variability in drift rate allows the fixed bound model to account for erroneous responses that are slower than correct responses. In contrast, slow errors emerge naturally from time-variant models with collapsing boundaries or urgency signals (Ditterich, 2006b), which might therefore provide an alternative conceptual grounding for the between-trial variability parameters of the conventional fixed bounds model.
We tested this possibility by comparing the fixed bounds model with between-trial variability in drift rate (η), start-point (sz), and nondecision time (st) to two models with no between-trial variability parameters (i.e., η = 0, sz = 0, st = 0): a collapsing bounds model with free parameters for the shape (k), scale (λ), and asymptotic boundary setting (a′) and an urgency signal model with delay (d) and two shape parameters (sx, sy). In this comparison, the three models have the same number of free parameters, so both AIC and BIC reduce to a direct comparison of log-likelihoods.
The stacked bar plots in Figure 6 show cumulative posterior model probabilities separately for each subject. Although the results are less clear than in the comparison above, the overall conclusion is unchanged by omission of between-trial variability parameters from the collapsing bounds models. As before, more human participants were best described by the fixed bounds model than by either the collapsing bounds or urgency signal models. Breaking the results for human participants down by experiment reveals that one experiment provides clear evidence in favor of the collapsing bounds or urgency signal model (Middlebrooks and Schall, 2014), two experiments provide mixed evidence (Experiment 1; Palmer et al., 2005), and the other two provide majority evidence in favor of the fixed bounds model (Ratcliff et al., 2001; Ratcliff and McKoon, 2008).
Figure 6. Approximations to posterior model probabilities in favor of the fixed bounds model with between-trial variability parameters and the urgency signal and collapsing bounds models without between-trial variability parameters. All details are as described for the top row of Figure 5.
For data collected from nonhuman primates, the analysis without between-trial variability in the collapsing bound and urgency signal models provided strong support for the fixed bound model, which was preferred for six of the eight monkeys. As with the human data, this reinforces the conclusion that our decision to include the between-trial variability parameters in the primary analysis did not unfairly penalize the collapsing bounds and urgency signal models.
Model recovery
We took care to ensure that our methods for evaluating evidence in favor of collapsing versus fixed bounds models were robust, sensitive, and unbiased. We ran the parameter optimization routine three times independently for the fit of each of the three models to each subject. We used an identical model fitting procedure to examine model recovery for each experiment, using 100 datasets generated from each of the fixed and collapsing bounds models with parameter values averaged across subjects within each experiment. We did not run model recovery simulations for the urgency signal model because it received so little support from the data.
The bottom row of Figure 5 shows that, for many parameter settings, model recovery was excellent even for sample sizes matching the real data. Gray and black histograms represent data simulated from the fixed and collapsing bounds models, respectively, using the mean parameter estimates from the fits of the models to data. Distributions that fall below zero (dashed vertical line) support the fixed bounds model and distributions above zero support the collapsing bounds model.
Model recovery was perfect when the data-generating parameters—taken from the fits to data—differed substantially between the fixed and collapsing models. For example, with simulated datasets modeled after the experiments that supported the use of the collapsing bounds model (second row, Fig. 5; Palmer et al., 2005; Ratcliff et al., 2003; Roitman and Shadlen, 2002), the data-generating model was correctly identified 100% of the time (i.e., the gray distributions were below zero and black distributions above zero).
As expected, for those experiments in which the estimated parameters for the collapsing bounds model were similar to those of the fixed bounds model, datasets simulated from the collapsing bounds model were frequently classified as fixed bounds, because the estimated boundaries for the collapsing bounds model (second row, Fig. 5) were close to constant. The result is that the distributions of differences in BIC should all favor the fixed bounds model (i.e., fall below zero), because the collapsing bounds model is penalized for its additional parameters, complexity that is not justified by the data generated from models with close-to-constant boundaries.
Discussion
Diffusion models using decision boundaries that do not change during a decision (fixed bounds) have provided detailed accounts of many aspects of decision-making data (Albantakis and Deco, 2009, 2011; Basten et al., 2010; Boucher et al., 2007; Fetsch et al., 2014; Hanes and Schall, 1996; Ho et al., 2009; Kiani and Shadlen, 2009; Lo and Wang, 2006; Mulder et al., 2012; Purcell et al., 2010; Purcell et al., 2012; Ramakrishnan et al., 2012; Ratcliff et al., 2003; Ratcliff et al., 2007; Ratcliff and Starns, 2013; Resulaj et al., 2009; Shankar et al., 2011). More complex nonstationary models with collapsing boundaries or increasing urgency signals have recently become popular, especially in some (Churchland et al., 2008; Ditterich, 2006a; Drugowitsch et al., 2012), but not all (Purcell et al., 2010; Purcell et al., 2012), neurophysiological studies of primates. The dynamic models implement a constantly changing decision strategy in which the quantity of evidence required to trigger a decision decreases with time. We conducted the first extensive investigation of the evidence for models of speeded decision-making with static versus dynamic response boundaries. Overall, data from nine experiments provided most support for the conventional, fixed bound model. We found evidence for collapsing boundaries or urgency signals for a small proportion of human subjects, but for most of the nonhuman primate participants (six of eight). A follow-up analysis using an alternative, simpler, specification for the collapsing boundary and urgency signal models did not reverse the conclusions for human participants, but did provide much stronger support for the fixed bounds model from the monkeys' data (Fig. 6).
Our results highlight the dangers of generalizing widely based on data from just one decision-making paradigm, species, or procedure, as some proponents of dynamic diffusion models have done. Of the eight macaques that received very extensive practice, six exhibited strong support for dynamic bounds models in our primary analysis. Conversely, the fixed bounds model was strongly supported in three experiments in which participants completed only a single session (Experiment 1; Ratcliff and McKoon, 2008) or two sessions (Ratcliff et al., 2001) of participation. Practice is known to influence boundary settings in decision-making models (Balci et al., 2011; Dutilh et al., 2009; Starns and Ratcliff, 2010), so it is plausible that collapsing bounds models provide good descriptions of data from extremely highly practiced participants; however, this remains an open question. Indeed, there are a number of datasets in the literature that we did not explore that demonstrate a good fit of fixed bounds models to data from highly practiced participants (Ding and Gold, 2012b; Purcell et al., 2010; Purcell et al., 2012; Ramakrishnan et al., 2012).
Task practice may also influence decision strategy through an interaction with the procedure used to administer rewards. Roitman and Shadlen (2002) and Palmer et al. (2005) withheld feedback information (and its associated reward for correct responses for nonhuman primates) until a response was registered or a minimum time poststimulus onset had elapsed (1 s), whichever came later. This procedure rewards the decision-maker for withholding responses on some trials to avoid premature errors. Collapsing response boundaries might implement an appropriate withholding strategy: evidence accumulation can proceed as in a fixed bound model until the minimum delay time has passed, after which the boundary rapidly declines. An urgency signal model can similarly explain these data: the urgency signal is weak at trial onset but increases as the end of the minimum delay approaches, quickly eliciting a response once the delay window has passed. These hypotheses are supported by the evidence in favor of dynamic boundary models from the two experiments using delayed reward timing (Palmer et al., 2005; Roitman and Shadlen, 2002).
It appears that subjects might learn the delayed response strategy after considerable task practice. In Roitman and Shadlen (2002), nonhuman primates completed thousands of decision trials and all of the participants in Palmer et al. (2005) had previous experience with similar psychophysical experiments before participation in the target experiment. This strategy was not observed in our replication experiments, in which 74 humans completed a single session under conditions similar to Palmer et al. (2005) with a minimum delayed feedback time of 1 or 2 s. Those subjects provided strong support for the fixed bounds model.
Overall, our survey suggests that humans and nonhuman primates use similar approaches in speeded decision-making tasks, but that both species can adopt different strategies as a function of the task constraints imposed by the experimenter. Many factors might differentially influence the use of decision strategies, including training, time-on-task, instructions, and the procedures used for reward timing; future research is required to understand the relative contributions of these factors. By applying computational cognitive models, we can disentangle the strategic effects of speeded decision-making from those that reflect the accumulation of information. Such a model-based decomposition of performance in humans and nonhuman primates maximizes the opportunity to learn about similarities and differences in cognitive processes across species.
Footnotes
The authors declare no competing financial interests.
- Correspondence should be addressed to Dr. Guy E. Hawkins, Amsterdam Brain and Cognition Center, University of Amsterdam, Nieuwe Achtergracht 129, 1018 W Amsterdam, The Netherlands. guy.e.hawkins{at}gmail.com