Abstract
Recent work suggests that while animals decide between reaching actions, neurons in dorsal premotor (PMd) and primary motor (M1) cortex reflect a dynamic competition between motor plans and determine when commitment to a choice is made. This competition is biased by at least two sources of information: the changing sensory evidence for one choice versus another, and an urgency signal that grows over time. Here, we test the hypothesis that the urgency signal adjusts the trade-off between speed and accuracy during both decision-making and movement execution. Two monkeys performed a reaching decision task in which sensory evidence continuously evolves over the course of each trial. In different blocks, task timing parameters encouraged monkeys to voluntarily adapt their behavior to be either hasty or conservative. Consistent with our hypothesis, during the deliberation process the baseline and gain of neural activity in decision-related PMd (29%) and M1 cells (45%) were higher when monkeys applied a hasty policy than when they behaved conservatively, but at the time of commitment the population activity was similar across blocks. Other cells (30% in PMd, 30% in M1) showed activity that increased or decreased with elapsing time until the moment of commitment. Movement-related neurons were also more active after longer decisions, as if they were influenced by the same urgency signal controlling the gain of decision-related activity. Together, these results suggest that the arm motor system receives an urgency/vigor signal that adjusts the speed-accuracy trade-off for decision-making and movement execution.
SIGNIFICANCE STATEMENT This work addresses the neural mechanisms that control the speed-accuracy trade-off in both decisions and movements, in the kinds of dynamic situations that are typical of natural animal behavior. We found that many “decision-related” premotor and motor neurons are modulated in a time-dependent manner compatible with an “urgency” signal that changes between hasty and conservative decision policies. We also found that such modulation influenced cells related to the speed of the reaching movements executed by the animals to report their decisions. These results suggest that a unified mechanism determines speed-accuracy trade-off adjustments during decision-making and movement execution, potentially influencing both the cognitive and motor aspects of reward-oriented behavior.
Introduction
During natural behavior, animals are motivated to optimize their reward rate. To do so, they must find the best speed-accuracy trade-off (SAT) for both their decisions and their movements, and adjust it to the current context. Recent studies examined how the SAT is adjusted during a variety of perceptual discrimination tasks. For example, several fMRI studies reported that time pressure leads to an increased BOLD response during baseline periods (Forstmann et al., 2008; Ivanoff et al., 2008; van Veen et al., 2008). Additional mechanisms for SAT adjustment have been identified at the level of single neurons. In the frontal eye fields (FEFs), Heitz and Schall (2012) found baseline changes as well as changes in neural gain and the onset time of perceptual processing. Baseline and gain changes were also reported by Hanks et al. (2014) in the lateral intraparietal area. These results are consistent with the proposal that SAT adjustment varies the baseline and drift rate parameters of evidence accumulation models (EAMs) (Bogacz et al., 2010a; Heitz, 2014; Standage et al., 2014).
All of these previous studies used perceptual discrimination tasks in which sensory information is constant within each trial. However, during natural behavior, animals are faced with a dynamic and continuously changing environment; and in such conditions, gradual accumulation of evidence is too sluggish to quickly respond to stimulus changes. Indeed, we recently found that, during deliberation in a dynamic reach-decision task, neural activity in monkey dorsal premotor (PMd) and primary motor cortex (M1) is not compatible with EAMs because it does not reflect integrated evidence (Thura and Cisek, 2014). Instead, consistent with the urgency-gating model (UGM) (Cisek et al., 2009; Thura et al., 2012), activity in both regions combines rapid estimates of evidence with a growing “urgency signal” related to elapsed time. Nevertheless, the way in which SAT is adjusted in this model may be closely related to what has been observed in prior studies. In particular, because the urgency signal effectively controls the dropping accuracy criterion that defines a subject's SAT policy during decision-making (Churchland et al., 2008; Standage et al., 2014), it could also be used to adjust the SAT when timing parameters of the task are varied in a block-dependent manner. Thura et al. (2014) reported behavioral evidence consistent with this proposal and also found that the same urgency signal that energizes the decision also appears to influence the vigor of the selected action, suggesting a unified mechanism of SAT adjustment for both decisions and actions.
Here, we test this hypothesis at the neural level by comparing activity in PMd and M1 while monkeys perform a dynamic decision task in two contexts: one motivating hasty choices and one motivating more conservative behavior. The UGM predicts that, when decisions are hasty, the urgency signal will increase the baseline and gain of both decision- and movement-related neural activity in the motor system. Some of these results previously appeared in abstract form (Thura and Cisek, 2011, 2012).
Materials and Methods
Subjects and apparatus.
Two male macaque monkeys (Macaca mulatta; Monkey S: 6 years old, 6 kg; Monkey Z: 5 years old, 6 kg) were implanted, under anesthesia and aseptic conditions, with a titanium head fixation post and recording chambers. The local animal ethics committee approved surgery, testing procedure, and animal care. Monkeys sat head-fixed in a custom primate chair and performed two planar reaching tasks using a vertically oriented cordless stylus whose position was recorded by a digitizing tablet (CalComp, 125 Hz). Their nonacting hand (Monkey S: left hand for ∼2 years, then right hand for 6 months; Monkey Z: right hand for ∼1 year, then left hand) was restrained on an arm rest with Velcro bands. In some sessions, unconstrained eye movements were recorded using an infrared camera (ASL, 120 Hz). Stimuli and continuous cursor feedback were projected onto a mirror suspended between the monkey's gaze and the tablet, creating the illusion that they were in the plane of the tablet. Neural activity was recorded from the hemisphere contralateral to the acting hand with 1–4 independently moveable (NAN microdrive) microelectrodes (FHC), and data were acquired with the AlphaLab system (Alpha-Omega Eng).
Behavioral tasks.
Monkeys were trained to perform the “tokens” task (Fig. 1A) in which they are presented with one central starting circle (1.75 cm radius) and two peripheral target circles (1.75 cm radius, arranged at 180° around a 5 cm radius circle). The monkey begins each trial by placing a handle in the central circle, in which 15 small tokens are randomly arranged. The tokens then begin to jump, one-by-one every 200 ms (“predecision interval”), from the center to one of the two peripheral targets. The monkey's task is to move the handle to the target he believes will ultimately receive the majority of tokens. The monkey is allowed to make the decision as soon as he feels sufficiently confident, and has 500 ms to bring the cursor into a target after leaving the center. Crucially, when the monkey reaches a target, the remaining tokens move more quickly to their final targets (“postdecision interval,” which was either 50 or 150 ms in separate “fast” and “slow” blocks of trials, respectively). Once all tokens have jumped, visual feedback is provided to the monkey (the chosen target turns green for correct choices or red for error trials) and a drop of fruit juice is delivered for choosing the correct target. A 1500 ms intertrial interval precedes the following trial.
In both fast and slow blocks, the monkey is thus presented with a trade-off: either wait until the decision can be made with confidence, or guess ahead of time, which may not be as reliable but could yield potential successes more quickly (because of the acceleration of the remaining tokens). The trade-off can be adjusted between the two blocks. In particular, hasty decisions are more advantageous (in terms of reward rate) in the fast blocks than in the slow blocks because guessing quickly allows the monkey to save more time. Importantly, because all task events preceding target acquisition are identical between the two blocks, any differences in neural activity between the two blocks before target acquisition must reflect differences in the strategy voluntarily adopted by the monkeys.
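The time saved by an early commitment can be illustrated with a short calculation. The sketch below is an approximation under stated assumptions, not the authors' reward-rate analysis: it assumes the remaining tokens switch to the postdecision interval at the moment of commitment and ignores reaction and movement time. The function name `time_saved_ms` is hypothetical.

```python
N_TOKENS = 15          # tokens per trial (from the task description)
PRE_INTERVAL_MS = 200  # predecision interval between token jumps
FAST_POST_MS = 50      # postdecision interval in fast blocks
SLOW_POST_MS = 150     # postdecision interval in slow blocks


def time_saved_ms(tokens_jumped: int, post_interval_ms: int) -> int:
    """Approximate time saved by committing after `tokens_jumped` jumps.

    Assumption: every remaining token jumps at the postdecision interval
    instead of the predecision interval; movement time is ignored.
    """
    if not 0 <= tokens_jumped <= N_TOKENS:
        raise ValueError("tokens_jumped must be between 0 and 15")
    remaining = N_TOKENS - tokens_jumped
    return remaining * (PRE_INTERVAL_MS - post_interval_ms)


if __name__ == "__main__":
    for n in (3, 8, 13):
        fast = time_saved_ms(n, FAST_POST_MS)
        slow = time_saved_ms(n, SLOW_POST_MS)
        print(f"commit after {n:2d} jumps: fast saves {fast} ms, slow saves {slow} ms")
```

Under these assumptions, the difference between blocks shrinks as fewer tokens remain in the center, which is why guessing early pays off most in the fast blocks.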
The monkeys were also trained to perform a delayed reach task (usually 30–48 trials per session). In this task, the monkey again begins by placing the cursor in the central circle containing the 15 tokens. Next, one of six peripheral targets is presented (1.75 cm radius, spaced at 60° intervals around a 5 cm radius circle); and after a variable delay (500 ± 100 ms), the 15 tokens simultaneously jump into that target. This “GO signal” instructs the monkey to move the handle to the target to receive a drop of juice. This task is used to determine cell tuning as well as the animal's mean reaction time (RT), used as an estimate of the total delays attributable to sensory processing and response initiation.
Dataset.
The last stage of monkeys' training in the tokens task involved providing animals with alternating blocks of slow and fast trials (∼100–150 trials in a block) of the tokens task. Based on behavioral data (for a detailed analysis of the same animals' behavior, see Thura et al., 2014), we defined two periods during this last stage: first, when behavior was comparable between the two blocks; and second, when the monkeys began to behave differently in the two blocks, in terms of decision duration (DD) and success probability. Because the main goal of the present study is to explore neural correlates of monkeys' SAT adjustment between the blocks, data presented in this report only include trials performed during the second of these periods.
Neural recordings.
Our standard procedures for single-unit recordings in the PMd and M1, signal processing, and data management have been described previously (Thura and Cisek, 2014). During recording sessions, we focused on cells showing a change of activity in the tokens task, and monkeys were usually performing the task while we were searching for cells. When one or more task-related cells were isolated, we ran a block of 30–48 trials of the delayed reach task to determine spatial tuning and select a preferred target (PT) for each cell (i.e., the target associated with the highest firing rate during one or more task epochs). Next, we ran blocks of tokens task trials using the PT of an isolated cell and the 180° opposite target (OT). We sometimes simultaneously recorded several task-related cells showing different spatial preferences; and because we always selected a single pair of targets, the actual best direction for each of the recorded cells was not always among these two.
We usually started recording cells in the slow block because monkeys were more conservative in this condition (see Results). It was thus easier to assess cell properties online and more convenient to search for cells because fewer rewards were spent. When possible, cells were tested with multiple repetitions of slow and fast blocks to control for potential confounds related to evolving signals, elapsing time, and the monkey's fatigue or satiation.
Behavioral data analysis.
Methods to analyze monkeys' behavior in the tokens task have been described previously (Cisek et al., 2009; Thura and Cisek, 2014). Briefly, the tokens task allows us to calculate, at each moment in time, the “success probability” associated with choosing each target. To characterize the success probability profile for each trial, we calculated this quantity (with respect to the target ultimately chosen by the monkey) for each token jump (Fig. 1B). For example, if at a particular moment in time the right target contains NR tokens, the left contains NL tokens, and NC tokens remain in the center, then the probability that the target on the right will ultimately be the correct one (i.e., the success probability of guessing right) is as follows:
$$p(R \mid N_R, N_L, N_C) = \frac{N_C!}{2^{N_C}} \sum_{k=0}^{\min(N_C,\, 7 - N_L)} \frac{1}{k!\,(N_C - k)!} \tag{1}$$
Although each token jump and each trial were completely random, we could classify a posteriori some specific classes of trials embedded in the fully random sequence (e.g., “easy,” “ambiguous,” or “misleading” trials, Fig. 1C). RT was calculated as the difference between the time of movement onset (based on kinematics) and the time of the first token jump. Decision time (DT) was estimated by subtracting from the RT the monkey's mean RT from the delayed reach task performed on the same day. We could then compute for each trial the duration of a decision as well as its success probability at the time of the decision (Fig. 1B). Wilcoxon-Mann-Whitney (WMW) tests were used to compare RT, DD, or success probability distributions.
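The success probability described above can be computed by counting the ways the remaining tokens could distribute. The following is a minimal sketch, assuming that each remaining token jumps to either target with probability 0.5 and that 15 tokens are used, so a target is correct once it holds at least 8. The function name `success_probability` is illustrative.

```python
from math import comb


def success_probability(n_chosen: int, n_other: int, n_center: int,
                        total: int = 15) -> float:
    """Probability that the chosen target ends up with the majority of tokens.

    Assumption: each of the `n_center` remaining tokens independently jumps
    to either target with probability 0.5.
    """
    if min(n_chosen, n_other, n_center) < 0:
        raise ValueError("token counts must be non-negative")
    if n_chosen + n_other + n_center != total:
        raise ValueError("token counts must sum to the total number of tokens")
    majority = total // 2 + 1                      # 8 tokens when total is 15
    # The other target may receive at most this many of the remaining tokens.
    max_to_other = min(n_center, majority - 1 - n_other)
    if max_to_other < 0:
        return 0.0                                 # other target already won
    favorable = sum(comb(n_center, k) for k in range(max_to_other + 1))
    return favorable / 2 ** n_center


if __name__ == "__main__":
    print(success_probability(0, 0, 15))  # start of trial, no information yet
    print(success_probability(3, 0, 12))  # three tokens in favor of the choice
```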
Calculation of the monkey's accuracy criterion at DT relies on the available sensory evidence at that time. Because it is unlikely that the monkeys could compute Equation 1, we calculated a simple “first-order” approximation of sensory evidence as the sum of log-likelihood ratios (SumLogLR) of individual token movements as follows:
$$\mathrm{SumLogLR}(n) = \sum_{k=1}^{n} \log \frac{p(e_k \mid S)}{p(e_k \mid U)} \tag{2}$$
where $n$ is the number of tokens that have moved, $p(e_k \mid S)$ is the likelihood of a token event $e_k$ (a token jumping into either the selected or unselected target) during trials in which the selected target S is correct, and $p(e_k \mid U)$ is its likelihood during trials in which the unselected target U is correct. The SumLogLR is proportional to the difference in the number of tokens that have moved in each direction before the moment of decision (for more details on this analysis, see Cisek et al., 2009). To characterize the decision policy of a given monkey in a given block of trials, we binned trials as a function of the total number of tokens that moved before the decision, and calculated the average SumLogLR for each bin.
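The sum of log-likelihood ratios can be sketched in a few lines. The likelihood value used below (`p_toward_correct = 0.6`) is a hypothetical placeholder, not a value taken from this study; the actual likelihoods are described in Cisek et al. (2009). With any constant likelihood, the result is proportional to the difference in the number of tokens that moved in each direction.

```python
import math


def sum_log_lr(jumps, p_toward_correct: float = 0.6) -> float:
    """Sum of log-likelihood ratios over a sequence of token jumps.

    jumps: iterable of +1 (token moved to the selected target) or
           -1 (token moved to the unselected target).
    p_toward_correct: ASSUMED constant likelihood that a token moves
           toward whichever target is ultimately correct.
    """
    if not 0.5 < p_toward_correct < 1.0:
        raise ValueError("p_toward_correct must lie in (0.5, 1)")
    total = 0.0
    for jump in jumps:
        if jump == 1:
            # p(e|S) = p, p(e|U) = 1 - p
            total += math.log(p_toward_correct / (1.0 - p_toward_correct))
        elif jump == -1:
            # p(e|S) = 1 - p, p(e|U) = p
            total += math.log((1.0 - p_toward_correct) / p_toward_correct)
        else:
            raise ValueError("each jump must be +1 or -1")
    return total


if __name__ == "__main__":
    # Two tokens toward the selected target, one away: net difference of one.
    print(sum_log_lr([1, 1, -1]))
```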
Above, we defined “sensory evidence” as the information, pertinent to the correct choice, which is continuously present within the stimulus (i.e., the number of tokens in each target). This is analogous to the motion signal continuously present within the stimulus of random-dot motion discrimination tasks (Britten et al., 1992), and to the total probability indicated by the visible symbols in the task of Yang and Shadlen (2007). In contrast, individual token jumps are novel pieces of information that change the sensory evidence, analogous to the appearance of a new symbol in the task of Yang and Shadlen (2007) or to each new auditory click in the click-counting task of Brunton et al. (2013). In previous publications, we suggested that the brain only integrates that novel information (Cisek et al., 2009; Thura et al., 2012). For example, we have recently shown that, during random-dot motion discrimination, the motion signal is not integrated over time, as usually assumed (Carland et al., 2015), and that what is instead integrated are the changes in the motion signal (Thura et al., 2012).
All arm and eye movement data were analyzed off-line using MATLAB (The MathWorks). Reaching characteristics were assessed using monkeys' movement kinematics. Horizontal and vertical position data were first differentiated to obtain a velocity profile and then filtered using a sixth-order low-pass filter with a frequency cutoff of 15 Hz. Onset and offset of movements were determined using a 3 cm/s velocity threshold. Peak velocity was determined as the maximum value between these two events.
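The threshold-based detection of movement onset, offset, and peak velocity can be sketched as follows. This is an illustrative Python version, not the original MATLAB code. It assumes that the threshold applies to tangential speed, uses the 125 Hz tablet sampling rate, and omits the low-pass filtering step for brevity. The function name `reach_kinematics` is hypothetical.

```python
import numpy as np


def reach_kinematics(x_cm, y_cm, fs_hz: float = 125.0,
                     threshold_cm_s: float = 3.0):
    """Return (onset_s, offset_s, peak_speed_cm_s), or None if no movement.

    Assumptions: positions are in cm, sampled at `fs_hz`; the velocity
    threshold is applied to tangential speed; no filtering is applied here.
    """
    x = np.asarray(x_cm, dtype=float)
    y = np.asarray(y_cm, dtype=float)
    # Differentiate position (central differences) to get velocity in cm/s.
    vx = np.gradient(x) * fs_hz
    vy = np.gradient(y) * fs_hz
    speed = np.hypot(vx, vy)

    above = np.flatnonzero(speed > threshold_cm_s)
    if above.size == 0:
        return None
    onset = int(above[0])
    # Offset: first sample after onset at or below the threshold.
    below = np.flatnonzero(speed[onset:] <= threshold_cm_s)
    offset = onset + int(below[0]) if below.size else len(speed) - 1
    peak = float(speed[onset:offset + 1].max())
    return onset / fs_hz, offset / fs_hz, peak
```

A call such as `reach_kinematics(x, y)` on the stylus traces of one trial returns times relative to the first sample, so they would need to be aligned to task events separately.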
Neural data analysis.
All neurophysiological data reported here were acquired from correct or error trials in which the monkeys completed the tokens task by choosing one of the two targets. To be included in the analyses, neurons had to be recorded during at least 100 trials in each of the slow and the fast blocks (in one or more repetitions of each block).
Neurons were selected according to their anatomical location and physiological properties (Fig. 1D). Among all cells recorded in PMd and M1, we first focus on those showing a significant spatial preference for one of the targets during the deliberation process (i.e., between the first token jump and our estimate of DT). In a recent study (Thura and Cisek, 2014), we showed that these cells reflect the deliberation process by tracking the evolution of sensory evidence and then signal the commitment to the choice ∼280 ms before movement execution. For each of these cells, we calculated the mean activity for each target choice during the 200 ms preceding DT in the tokens task and assessed the significance using a receiver-operating characteristic (ROC) (Green and Swets, 1966; Shadlen et al., 1996) analysis with a criterion of 0.65.
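The ROC-based selection criterion can be illustrated with a small example. The sketch below computes the area under the ROC curve from two sets of single-trial firing rates (one per target choice) and compares it with the 0.65 criterion. Treating the area symmetrically, so that a preference for either target counts, is an assumption of this sketch, and the function names are hypothetical.

```python
import numpy as np


def roc_area(rates_a, rates_b) -> float:
    """Area under the ROC curve: P(a > b) + 0.5 * P(a == b)."""
    a = np.asarray(rates_a, dtype=float)[:, None]
    b = np.asarray(rates_b, dtype=float)[None, :]
    if a.size == 0 or b.size == 0:
        raise ValueError("both samples must be non-empty")
    greater = np.count_nonzero(a > b)
    ties = np.count_nonzero(a == b)
    return (greater + 0.5 * ties) / (a.size * b.size)


def is_choice_selective(rates_target_1, rates_target_2,
                        criterion: float = 0.65) -> bool:
    """True if firing rates discriminate the two choices above the criterion.

    Assumption: selectivity for either target qualifies, so the area is
    folded around 0.5 before comparison.
    """
    area = roc_area(rates_target_1, rates_target_2)
    return max(area, 1.0 - area) > criterion
```

Here each input would hold a cell's mean firing rate in the 200 ms preceding DT, one value per trial, split by the target chosen.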
The effects of block on these decision-related cells (ROC > 0.65) were assessed via comparisons of averaged neural activity in various epochs of the task (see Results). In each individual cell, robustness of the effect of block was assessed with a WMW statistical test (which compares neural activity distributions). For a subset of cells, we investigated the relationship between sensory evidence (calculated as SumLogLR) and neural activity as a function of DD (in bins of 200 ms corresponding to the token jumps) in the two blocks of trials. Spearman's rank test was used to assess significance of the relationship between SumLogLR and neural activity, and analysis of covariance (ANCOVA) was used to assess the interaction between blocks and DD on neural activity. From this analysis, we focused on the evolution of the averaged activity in PMd and M1 for the condition when SumLogLR = 0 (equal evidence for each target) as a function of time (after each token jump, every 200 ms bin) in either the slow or the fast blocks. The average activity was calculated by averaging means across cells for bins in which data were available (even-numbered jumps). For odd-numbered jumps, the mean activity at zero evidence was estimated for each cell via interpolation of data surrounding SumLogLR = 0 and the population average activity was computed by averaging these means. To assess the robustness of the effect of block on the population activity at SumLogLR = 0, we performed a bootstrap test for bins in which data at this time were available (even-numbered jumps). The bootstrap procedure consisted of resampling the firing rate data of each cell 10,000 times in each bin and in the two block conditions to produce distributions of means and a distribution of the difference in the means. If zero lies outside the 2.5%–97.5% percentiles of the distribution of resampled differences, then the effect is considered significant at p < 0.05.
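The bootstrap comparison between blocks can be sketched as below. This is a simplified illustration that resamples one pooled set of firing rates per block, whereas the analysis described above resamples the data of each cell separately before averaging. The function name and the fixed seed are assumptions made for reproducibility of the example.

```python
import numpy as np


def bootstrap_mean_difference(fast, slow, n_resamples: int = 10_000,
                              seed: int = 0):
    """Bootstrap the difference in mean firing rate (fast minus slow).

    Returns (lower, upper, significant), where lower/upper are the 2.5th
    and 97.5th percentiles of the resampled differences and `significant`
    is True when zero lies outside that interval (p < 0.05).
    """
    rng = np.random.default_rng(seed)
    fast = np.asarray(fast, dtype=float)
    slow = np.asarray(slow, dtype=float)
    if fast.size == 0 or slow.size == 0:
        raise ValueError("both samples must be non-empty")
    # Resample each condition with replacement and take the mean each time.
    fast_means = rng.choice(fast, size=(n_resamples, fast.size)).mean(axis=1)
    slow_means = rng.choice(slow, size=(n_resamples, slow.size)).mean(axis=1)
    diffs = fast_means - slow_means
    lower, upper = np.percentile(diffs, [2.5, 97.5])
    significant = not (lower <= 0.0 <= upper)
    return float(lower), float(upper), significant
```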
Next, we compared neural activity in the two blocks of trials in all cells showing a significant correlation (Pearson's linear correlation coefficient) between (1) firing rate and DD and (2) firing rate and peak velocity of the reaching movement (see Results). Average neuronal activity correlated with DD or movement peak velocity was computed for each block and compared in two groups of trials defined by the DDs. As above, significance of the effects was assessed by a bootstrap procedure.
Instantaneous firing rate was assessed via a partial interspike interval method. When analyzing data with respect to the start of the trial (first token jump), we always exclude all spikes occurring after our estimate of DT (i.e., any activity associated with movement initiation and/or execution). This is important to prevent averaging artifacts due to the very wide range of DDs in the tokens task.
The significance level of all statistical tests was set at 0.05.
Results
Effect of block on monkeys' behavior
The primary goal of the present paper is to assess the neural mechanisms of the SAT adjustments induced by the timing parameters of the task. After extensive training, both monkeys adapted their behavior as a function of the block (“slow” or “fast”). In a recent report, we described their behavior in detail both during and after they adapted their decision and movement policies (Thura et al., 2014). Here, we focus our work on data collected after each monkey showed strong and consistent behavior that differed in the two block conditions. Typically, this adaptation consists of a significant reduction of DDs and a decrease of success probabilities in fast blocks compared with slow blocks. Figure 2A illustrates an example session of Monkey Z's behavior as a function of the block condition.
Across all trials during which neurons described in the present paper were recorded, Monkey S and Monkey Z made decisions on average 491 and 534 ms faster in the fast blocks compared with the slow blocks, respectively (WMW test, p < 0.001 for both monkeys). This time savings was accompanied by a reduction of the overall performance in the fast blocks compared with slow blocks, for both monkeys (73.9% vs 79.9% of correct choices for Monkey S; 70.7% vs 76.9% of correct choices for Monkey Z).
This trade-off was very consistent between sessions. Figure 2B shows, for both monkeys and for all sessions during which neurons are analyzed below (49 sessions in Monkey S, 81 sessions in Monkey Z), the mean RTs in the fast versus the slow block condition. Both animals made decisions significantly earlier in fast blocks compared with slow blocks in all recording sessions (WMW test, p < 0.05), except for 2 of 49 and 3 of 81 sessions in Monkey S and Monkey Z, respectively. As expected, in most of these sessions, success probability as well as performance were lower during fast blocks compared with the slow blocks (Fig. 2C,D). Thus, monkeys traded speed for accuracy between the blocks. Did they also accomplish this adjustment within blocks?
We estimated the “accuracy criterion” for committing to a choice in a given block by computing, for each monkey and each block type, the available sensory evidence for the chosen target at the time of the decision as a function of DD (see Materials and Methods) (Cisek et al., 2009). Figure 3A shows that, for both animals and in both blocks, the average level of sensory evidence at the time of commitment first increases and then (after ∼900 ms) decreases as a function of DD. Although this pattern appears complex, it can be easily explained with a purely linear urgency signal with intertrial noise: Early decisions (<900 ms, 23% in Monkey S, 38% in Monkey Z) occur only when noise happens to be high, leading to an early guess even when evidence is low; hence, the early part of the curve is low and rises with time. Later decisions are more common (77% and 62% in Monkey S and Monkey Z, respectively) and the mean evidence level provides an accurate estimate of the animal's “accuracy criterion” at that time. Thura et al. (2014) simulate these data (Fig. 3A, dashed lines) using a UGM with the purely linear urgency signals shown here in Figure 3B.
An accuracy criterion that drops over time implies that, if sensory information is strong, the monkeys usually decide quickly. If information is ambiguous, they wait to see if it improves; and finally, if too much time has passed, they make a guess. We found that, for both monkeys, the estimated urgency signal is significantly higher during fast blocks than slow blocks. This suggests that, as expected, the monkeys are more willing to guess in the fast blocks and wait longer to decide in the slow blocks.
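The relationship between a growing urgency signal and a dropping accuracy criterion can be made concrete with a toy example. The sketch below is a deterministic simplification of urgency gating, in which evidence is multiplied by a linear urgency signal and compared with a fixed threshold. It omits noise and any filtering, and all parameter values are hypothetical rather than fitted to the monkeys' behavior.

```python
def accuracy_criterion(jump: int, intercept: float, slope: float,
                       threshold: float = 1.0) -> float:
    """Evidence needed to commit at a given token jump.

    With linear urgency u = intercept + slope * jump, commitment requires
    evidence * u >= threshold, i.e. evidence >= threshold / u.
    """
    urgency = intercept + slope * jump
    if urgency <= 0:
        raise ValueError("urgency must be positive")
    return threshold / urgency


def commitment_jump(evidence_by_jump, intercept: float, slope: float,
                    threshold: float = 1.0):
    """1-based index of the first jump at which commitment occurs, or None."""
    for jump, evidence in enumerate(evidence_by_jump, start=1):
        if evidence >= accuracy_criterion(jump, intercept, slope, threshold):
            return jump
    return None


if __name__ == "__main__":
    evidence = [1, 2, 1, 2, 3, 4, 5]   # hypothetical evidence after each jump
    # Hypothetical policies: higher urgency intercept in the "fast" block.
    print("fast:", commitment_jump(evidence, 0.3, 0.05, threshold=0.9))
    print("slow:", commitment_jump(evidence, 0.1, 0.05, threshold=0.9))
```

In this toy setting, raising the urgency intercept lowers the evidence required at every jump, so the same evidence sequence leads to an earlier commitment.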
As stated in Materials and Methods, monkeys usually started each session with the slow block of trials. It is thus conceivable that the effects described above (faster RTs and lower accuracy in fast blocks) are related to fatigue or to a decreasing attention/motivation over the course of the session. To control for these potential confounds, we tested animals in sessions in which two blocks of slow trials surrounded a fast period. The results illustrated in Figure 3C show that Monkey S recovers almost entirely a slow-type pattern in the second of two slow blocks (following the fast block), ruling out the possibility that effects of blocks are due to factors such as motivation or satiety. Monkey Z shows the same trend but without a complete recovery. In the second slow block (cyan line), he was usually a little more willing to tolerate less sensory evidence and decide faster than in the first slow block (blue line). Nevertheless, his behavior was still significantly different and more conservative than in the preceding fast block (red line).
Overall, these behavioral data show that the context of the task (fast vs slow block) strongly affects monkeys' behavior and that they voluntarily establish (within a given block) and adjust (between blocks) their SAT. For both monkeys, the urgency signals derived from their choice behavior grow as a function of time in both blocks, but the initial level of urgency is higher in the fast blocks and then converges to the level of the slow blocks late in the trials. This makes good sense because the difference in the time savings afforded by each block is smaller as the number of tokens remaining in the center diminishes with time.
Can we find a neural signature of these patterns of urgency in the activity of PMd and M1 cells during decision-making and movement execution? In particular, the UGM predicts that, for a given level of sensory evidence, neural activity should be higher in the fast than the slow blocks at the start of each trial, then gradually rise over time but at a slower rate in fast blocks, eventually converging to the same level in both blocks (Fig. 3B). Furthermore, assuming that commitment is determined through a competition within PMd and M1 (Cisek, 2007; Thura and Cisek, 2014), neural activity at the time of commitment should be the same across blocks as well as across trials.
Decision-related neural activity in PMd and M1
Here, we analyze the activity of 211 PMd (131 in Monkey S) and 175 M1 (68 in Monkey S) cells that were recorded in at least 100 trials during each of the slow and fast blocks (Table 1). Because the results of all analyses were qualitatively the same regardless of which arm the monkey used to perform the task, we combined all results across neurons in both hemispheres in each area. Most of the neurons were recorded while monkeys performed a slow block followed by a fast block (371 of 386 cells). To control for potential confounds related to evolving signals or motivation, 228 of these cells were tested in at least one additional block of slow trials following the fast block; and of these, 55 cells were tested in two or more repetitions of both slow and fast blocks.
First, we compared neuronal activity between slow and fast blocks in PMd and M1 cells that discriminate the chosen target during the deliberation process. These cells were the subject of a recent paper showing that they are strongly involved in decision deliberation as well as choice commitment (Thura and Cisek, 2014). Here, we study these “decision-related” cells (PMd, n = 61; M1, n = 78) tested in enough trials in the two blocks, both before (baseline) and during the deliberation process. First, we found that average activity patterns during easy, ambiguous, and misleading trials (Fig. 1C) performed in the slow blocks replicate our previous observation: these cells track the evolution of sensory evidence guiding the choice and reach a peak of activity before the onset of movements (∼280 ms in PMd, ∼140 ms in M1) expressing that choice (Fig. 4). Next, we found that, overall, a large majority of these PMd (44 of 61, 72%) and M1 (58 of 78, 74%) cells were significantly modulated by the block condition before and/or during the deliberation process (significant WMW test on at least one of the epochs described below). In the majority of cases (30 of 44, 68% in PMd; 47 of 58, 81% in M1), activity was higher in the fast blocks compared with the slow blocks. Examples of such block-sensitive cells in both PMd and M1 are illustrated in Figure 5. Figure 5A, B shows an example M1 cell that was tested in two repetitions of slow and fast blocks. Baseline activity is unaffected by these changes, but the cell's response to token jumps is strongly influenced by the block condition, regardless of the ultimate direction chosen by the monkey. It exhibits a strong burst in the fast blocks, whereas it only slowly rises in the slow blocks. The PMd cell depicted in Figure 5C, D shows a similar effect. Here, activity is modulated by the block condition both during the baseline and the deliberation period. 
Furthermore, unlike the M1 cell shown above, in the second block of slow trials, the activity of this PMd cell does not come back to the same level observed during the first slow block, especially for the baseline period. This may reflect the tendency of monkeys to become hastier over the course of a daily session, as depicted above in Figure 3C, perhaps reflecting gradual devaluation of reward.
To quantify the effect of block at the population level, we first examined activity during a baseline period defined as the 200 ms preceding the first token jump. For this analysis, trials in which monkeys chose either the cell's preferred target (PT) or the opposite target (OT) were included, as no effect of evidence could affect activity at that time. The average response of the 61 “decision-related” PMd cells in both blocks is shown in Figure 6A (left). This plot shows that during fast blocks, activity is slightly increased compared with slow blocks. A cell-by-cell analysis revealed that approximately half of the cells (34 of 61, 56%) were significantly modulated by the block condition. Among these, we found that 22 (65%) had a significantly higher baseline activity in the fast blocks compared with the slow blocks (Fig. 6B, left). In M1, baseline activity was also modulated as a function of the block condition in more than half of the cells (46 of 78, 59%), with activity usually stronger in fast blocks (35 of 46, 76%; Fig. 6D, left), resulting in an overall slight increase of baseline activity during fast blocks (as shown on the M1 population average response in the left panel of Fig. 6C).
Next, we assessed the effect of block on neural activity during the deliberation process (discarding all activity after DT to prevent contamination by preparatory and execution processes). As trials were highly variable in terms of sensory evidence provided to the animals, we classified them according to their success probability profiles and focused our analyses on easy, ambiguous, and misleading trials (see Materials and Methods). In a previous report, we showed that both PMd and M1 cells reflected the evolution of the sensory evidence used by the monkey to make a choice. Indeed, regardless of the block, Figure 6A, C (three right panels) clearly shows that cells quickly discriminate the cell's PT in easy trials, discrimination takes longer in ambiguous trials, and in misleading trials, a switch of activity is observed, reflecting the inversion of success probability in such trials.
What happens when we compare this evidence-related activity between the slow and fast blocks? On average, population activity was strongly amplified during fast blocks compared with slow blocks, especially in easy and misleading trials, as illustrated in Figure 6A, C for both populations. In ambiguous trials, modulations appeared to be weaker at the population level, especially in PMd (Fig. 6A, third panel from left). Nevertheless, these results suggest that these cells are influenced by both the sensory evidence and by a block-dependent signal.
Next, we conducted a cell-by-cell analysis of all neurons recorded in at least 5 trials of a given trial type in each of the two blocks. To choose the time epochs for this analysis, we first visually inspected the time period during which population responses reflected trial types, and then made sure that these periods preceded decision commitment (Fig. 6A,C). Thus, for easy trials, we compared activity during a 200 ms time window shortly following the deliberation onset (from 200 to 400 ms after the first token jump); for ambiguous trials, we focused on the ambiguous epoch (from 400 to 600 ms after the first token jump); and for misleading trials, we focused on the epoch when neural activity tended to favor the target opposite to each cell's PT (from 700 to 900 ms after the first token jump). Given these constraints, only PT-related easy trials with DDs > 400 ms were considered (PMd: n = 40; M1 = 50); for ambiguous trials, both PT and OT-related trials with DDs > 600 ms were considered (PMd: n = 47; M1 = 63); and for misleading trials, only OT-related trials with DDs > 900 ms were considered (PMd: n = 52; M1 = 67).
Results depicted in Figure 6B, D (three right panels) show that a subset of PMd cells (6 of 40, 15%) were significantly modulated by the block condition shortly after deliberation onset during easy trials. Among them, all had significantly higher activity in the fast blocks compared with the slow blocks. In M1, activity during easy trials was also modulated as a function of the block condition in a subset of cells (9 of 50, 18%), with activity also almost always stronger during fast blocks (8 of 9, 89%). In ambiguous trials, 10 of 47 of the PMd cells and 9 of 63 of the M1 cells were significantly modulated by the block, usually (8 of 10 in PMd; 7 of 9 in M1) showing an increase of activity in the fast blocks compared with the slow blocks. Finally, during misleading trials, one-fourth of PMd cells (13 of 52) were significantly modulated by the block condition during the tested 200 ms epoch. Among them, we found that 9 of 13 (69%) had significantly higher activity in the fast blocks compared with the slow blocks. In M1, 19 cells were modulated by the block condition in misleading trials (28%), with activity usually stronger in fast blocks (16 of 19, 84%).
Because neural activity related to cells' PT peaks close to commitment time (Fig. 4), the higher neural activity usually observed in the fast blocks could reflect the rise of this activity to the peak in trials in which choice commitments occurred close to the analysis window. Although this cannot be responsible for the large effect of block that we observed on baseline activity in PMd and M1, it could potentially confound activity during deliberation, especially in easy trials. To control for this, we performed additional comparisons where we restricted our analyses to trials in which RT was much longer than the analysis windows (RTs >1000 ms and RTs >1200 ms). In both cases, we found similar results (data not shown), although the effects at the level of many individual cells did not meet significance because of the large number of discarded trials.
Another important control consists of assessing the effects of recording stability and/or monkeys' evolving motivation/fatigue over the course of a recording session. As neurons were usually recorded in a slow block of trials first, the observed amplification of signals in fast blocks could reflect factors other than SAT adjustment. To examine this issue, most of the decision neurons in PMd and M1 were tested in a second block of slow trials following the fast block. If modulations of activity are related to SAT adjustment, then activity should recover a slow-type pattern after the fast condition. Figure 7 shows this result. In Figure 7A, C (for PMd and M1 cells, respectively), we compare activity between a slow block and the fast block that followed it. As noted above, we observe an amplification of activity in the fast block. In Figure 7B, D, the comparison is between a fast block and the second slow block that followed it. In this second block of slow trials, the activity is reduced in comparison with the fast block, consistent with the proposal that the amplitude changes are primarily attributable to SAT adjustments.
The results described above suggest that the activity of a large population of PMd and M1 decision-related cells is amplified during fast blocks, when animals favor decision speed over accuracy. But is activity during the early part of the deliberation process a good predictor of monkeys' subsequent RTs within blocks? To answer this question, we performed a trial-by-trial correlation between DD and activity related to cells' PT from 200 to 500 ms following the first token jump for each PMd and M1 decision-related cell, separately in slow and fast blocks. We found that, for a subpopulation of cells (PMd: 18% in slow block, 23% in fast block; M1: 28% in slow block, 19% in fast block), early deliberation-related activity is significantly correlated with DD. As expected, correlations are usually negative, meaning that the stronger the firing rate, the shorter the decision. Although these percentages are not negligible and may implicate trial-to-trial variability in urgency as a factor influencing RTs, urgency is probably only one of many factors (e.g., sensory evidence, fluctuations in attention) that determine the RT on any particular trial.
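As an illustration of this trial-by-trial analysis, the following is a minimal sketch of a Pearson correlation between DD and early firing rate for one cell in one block. The permutation test used here for significance is an assumption made for the example (the text does not specify how significance was assessed), and all function names are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two sequences of equal length >= 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation (assumed test)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between DD and rate
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

A negative `pearson_r(dd_ms, early_rate_hz)` corresponds to the pattern reported here: stronger early firing, shorter decision.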
Using the UGM, we can estimate the shape of the urgency signal in the two blocks on the basis of each animal's behavior (Fig. 3B) (for details, see Thura et al., 2014). Is it possible to observe a neural signature of these shapes from the block-modulated cells? To answer this question, we looked at the correlation between cell activity and sensory evidence (calculated as the SumLogLR) as a function of time in each of the two blocks (Fig. 8). As previously reported (Thura and Cisek, 2014), both PMd and M1 cells are positively correlated with sensory evidence, from the beginning of the deliberation until the time of commitment. Here, we focus on cells that were significantly modulated by the block condition, with firing rate higher in the fast blocks compared with the slow blocks, during at least one of the tested periods (baseline period in all trials, 200–400 ms in easy trials, 400–600 ms in ambiguous trials, 700–900 ms in misleading trials; PMd, n = 30; M1, n = 47). The comparison of the correlation in the two conditions shows, as expected, that both PMd and M1 activity is usually stronger in the fast blocks than in the slow blocks (Fig. 8A,C). More interestingly, if we examine the neuronal activity at the zero evidence point, and then plot it as a function of time in the two conditions (Fig. 8B,D), we observe that, for PMd cells, activity is growing regardless of the block (ANCOVA, time, F = 23.9, p < 0.0001) and is higher in the fast blocks (ANCOVA, block, F = 7.41, p = 0.0067), especially early in the trial. Nevertheless, the interaction between time and block is not statistically significant (ANCOVA, time × block, F = 0.77, p = 0.38). This is reminiscent of the urgency shapes we derived from monkeys' behavior. In M1, we observe a similar effect (ANCOVA, block, F = 10.6, p = 0.0012; time × block, F = 0.87, p = 0.35), although the activity does not grow over time as much as it does in PMd (ANCOVA, time, F = 9.7, p = 0.0019).
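One way to read the "zero evidence point" analysis is as the intercept of a linear fit of firing rate against sensory evidence, computed separately in each time bin. The sketch below follows that reading, which is an assumption for illustration and may differ from the procedure actually used; the input layout (`samples_by_time`) and function names are hypothetical.

```python
def ols_line(x, y):
    """Ordinary least-squares fit y = intercept + slope * x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def zero_evidence_activity(samples_by_time):
    """Map each time bin to the fitted firing rate at SumLogLR = 0.

    samples_by_time: {time_ms: [(sum_log_lr, rate_hz), ...]}
    """
    result = {}
    for time_ms, samples in sorted(samples_by_time.items()):
        evidence = [s[0] for s in samples]
        rates = [s[1] for s in samples]
        intercept, _slope = ols_line(evidence, rates)
        result[time_ms] = intercept
    return result
```

Running this once per block gives two time courses of evidence-independent activity, which is the quantity compared between fast and slow blocks in Figure 8B,D.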
We performed several control analyses to assess the robustness of these important results. First, we normalized the activity of each cell (based on the mean firing rate computed across blocks in a 60 ms window around commitment time, i.e., 310–250 ms before movement onset) before averaging across populations. The results (data not shown) were qualitatively the same as without normalization. Next, as for the analysis depicted in Figure 6, we controlled for a potential influence of commitment-related activity on the observed modulations. We performed the same analysis using only trials with RTs between 1000 and 1500 ms in both blocks, to see whether activity early in the trials (i.e., far from decision commitment) was still modulated by the SAT condition. We found that, even in this highly constrained condition, activity at zero evidence still rises with time and is still higher in the fast than the slow blocks (ANCOVA shows all effects described above significant at p < 0.05). Finally, to verify that both monkeys generalized SAT performance across limbs, we repeated the analysis after grouping cells according to the hemisphere recorded. The results show that modulation by the time- and block-related signal holds for both PMd and M1 populations regardless of the hemisphere sampled (especially in Monkey Z, in which many modulated cells were collected in each hemisphere). Together, these results suggest that both PMd and M1 cells are influenced by the evolving evidence and by a context-dependent, time-related signal, in agreement with the UGM (Cisek et al., 2009; Thura et al., 2014).
Decision-related neural activity in PMd and M1 at commitment time
In a previous article (Thura and Cisek, 2014), we demonstrated that, when the activity of decision-related cells is aligned on movement onset, PMd activity related to the chosen option reaches a peak ∼280 ms before movement initiation, whereas M1 activity follows with a peak ∼140 ms before movement onset. These peaks of activity cannot be related to saccade initiation because, in most trials (74%–79%), both monkeys were already looking at the selected target long before the time of the peak (Thura et al., 2014). We have previously proposed that this peak of activity reflects the monkey's volitional commitment to an action choice (Thura and Cisek, 2014).
Figure 9A illustrates activity aligned on movement for the example PMd and M1 cells described above in Figure 5. For both cells, PT-related activity reaches a peak before movement, whereas OT-related activity is suppressed. As shown in Figure 5, during deliberation the activity of both cells was strongly amplified in the fast blocks. However, when aligned on movement onset, we observe that both cells reach a similar peak of activity regardless of the block. This suggests that the setting of the SAT does not affect the neural threshold that determines commitment to a choice, although it influences the neural activity that leads to the crossing of that threshold.
Consistent with this proposal, the average population activity of both PMd and M1 cells aligned on movement onset is indeed very similar in the two blocks of trials (Fig. 9B). Nevertheless, when we examine the effect of block on individual PMd cells during a 200 ms period around the average peak time (280 ms before movement), the block effect is not absent in every cell: 33 PMd cells (54%) were significantly modulated (WMW test, p < 0.05), showing either a stronger (20 of 33, 61%) or a weaker peak activity in fast blocks compared with slow blocks (Fig. 9C, top). These balanced modulations are observed despite the fact that RTs were almost always shorter in fast blocks compared with slow blocks in the corresponding recording sessions (Fig. 2B). We conducted the same analysis in M1, comparing activity in a 200 ms period around the average peak time (140 ms before movement). We also observed a significant proportion of modulated cells (32 of 78, 41%), with a balance between increased activity in the fast (59%) or slow blocks (Fig. 9C, bottom). Again, RTs during these recording sessions were almost always shorter in the fast than the slow blocks (Fig. 2B).
Time-related neural activity in PMd and M1 at decision commitment
The results described above suggest that a context-dependent, time-related signal influences the neural activity responsible for choice deliberation and commitment in PMd and M1. So far, we only looked at a very specific population of “decision-related” cells: those that are significantly tuned during the deliberation process. But how global is this time-related signal? Does it also influence cells that are not necessarily involved in the deliberation and commitment process? To examine this, here we consider all of the PMd and M1 cells we recorded in the two blocks of trials (PMd, n = 211; M1, n = 175), including “decision-related” cells as well as cells that are not significantly tuned before decision commitment. To assess the effect of time on cells' activity, we calculated for each of them a trial-by-trial coefficient of correlation (Pearson product-moment correlation coefficient r) between DD and the average firing rate in a 200 ms period just before choice commitment (480–280 ms before movement onset). We found that 30% of PMd (63 of 211) and 30% of M1 (53 of 175) cells were significantly modulated by DD (Fig. 10A,B). These modulations could be either positive (the longer the decision, the higher the firing rate) or negative. Among such time-dependent cells, some belong to the decision-related cells described above, with a significant modulation of activity by sensory evidence (18 of 211, 9% in PMd; 25 of 175, 14% in M1). According to the scatter plot in Figure 10A, B, it appears that effects of time and evidence are distributed among a continuum of cells in both PMd and M1. Figure 10C illustrates an example cell recorded in PMd whose activity is strongly modulated by sensory evidence, but not by elapsing time, whereas Figure 10D shows a PMd cell less modulated by sensory evidence, but strongly affected by elapsing time.
Does this influence of DD reflect only elapsing time or is it also context dependent, like the putative urgency signals shown in Figure 3B? For these time-related activities, we can assess the level of activity preceding DT and compare it between the two blocks. If this time-dependent signal is indeed related to the urgency signal of our decision mechanism, we should observe stronger activity after longer DDs regardless of the blocks (because urgency increases with time in both conditions) and higher activity in the fast block compared with the slow block, especially for the shortest decisions. While the evidence-related cell (weakly affected by DD) shows only a weak trend of time-related modulation that is not context-dependent (Fig. 10E), the pattern of activity of the time-related cell strongly supports our hypothesis (Fig. 10F). It shows a strong context-dependent effect: for the same DD range, activity is higher in the fast condition compared with the slow condition, particularly for the shortest decisions.
Next, we tested this prediction at the population level, following the same logic: if the population of time-sensitive cells is influenced by the urgency signals estimated from monkeys' behavior (Fig. 3B), then we should observe three trends for those cells that are positively correlated with DD: (1) activity at DT should increase with DD, regardless of the block; (2) activity preceding short decisions should be higher in the fast blocks than in the slow blocks; but (3) activity should be similar for the longest decisions in the two blocks because the urgency signals tend to converge in the two blocks for long decisions (Fig. 3B). Furthermore, we should observe the opposite pattern for cells that are negatively correlated with DD.
We performed two analyses to test these predictions at the population level. First, we took advantage of the large variability of animals' RTs in the tokens task (Monkey S: 274–2773 ms; SD: 503 ms; Monkey Z: 249–2851 ms; SD: 583 ms) to divide the trials (for each cell) into two equal size groups based on the median DD, regardless of the block condition. Figure 11B, G illustrates for each group of trials performed in both blocks the average firing rate aligned on movement onset across all PMd (top panels) and M1 (bottom panels) cells, which were either positively (Fig. 11B) or negatively (Fig. 11G) correlated with DD in a 200 ms epoch preceding decision commitment (Pearson coefficient of correlation, p < 0.05). These plots show that PMd and M1 time-sensitive cells appear to reflect the influence of the context-dependent urgency signals estimated from monkeys' behavior: neural activity related to the shortest decisions tends to be higher (lower) in the fast blocks compared with the slow blocks for cells that are positively (negatively) correlated with DD, whereas activity related to the longest decisions tends to be similar in both blocks. To control for the possibility that these effects are driven by neurons with high firing rates, we performed a normalization of each cell activity (based on the activity of each cell around commitment time) before averaging across populations. The results (data not shown) were qualitatively indistinguishable from those obtained when the raw firing rate was considered.
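The median split described above can be sketched as follows. This is a minimal illustration, assuming a hypothetical per-trial record with `dd_ms`, `block`, and `rate_hz` fields; trials falling exactly on the median are assigned to the "short" group here, which is an arbitrary choice for the example.

```python
import statistics


def median_split_means(trials):
    """Split one cell's trials at the median DD (pooled across blocks),
    then average the firing rate within each (block, group) cell.

    Returns {(block, "short" | "long"): mean_rate_hz}.
    """
    median_dd = statistics.median([t["dd_ms"] for t in trials])
    groups = {}
    for t in trials:
        label = "short" if t["dd_ms"] <= median_dd else "long"
        groups.setdefault((t["block"], label), []).append(t["rate_hz"])
    return {key: sum(v) / len(v) for key, v in groups.items()}
```

Because the median is computed regardless of block, the "short" and "long" groups cover the same DD ranges in both blocks, which is what allows activity to be compared between blocks for matched decision durations.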
As illustrated in Figure 11A, F, monkeys' behavior was highly consistent during sessions in which these cells were recorded: average sensory evidence at commitment time was higher in slow blocks compared with fast blocks, especially for the shortest decisions, suggesting that urgency was initially lower but grew faster in slow blocks compared with fast blocks. To better demonstrate this link between behavior and neural activity, we calculated for each group of trials and in each block the average firing rate of the cells in the 200 ms period preceding our estimate of DT and we then averaged this activity across all the PMd or M1 time-modulated cells. The results are illustrated in Figure 11C, H. In PMd, activity is significantly modulated by DD in the slow blocks, both for the positively and negatively correlated cells (17 vs 26 Hz, CI from bootstrap test, p < 0.0001 and 19 vs 14 Hz, p < 0.0001, respectively), whereas the effect of duration is weaker in the fast blocks but still very significant. The effect of block is stronger (p < 0.0001) for the shortest decisions than for the longest ones, for both PMd populations, as predicted by the urgency model. In M1, we found similar qualitative results for both cells positively and negatively correlated with DDs. Overall, these data are consistent with the hypothesis that activity is influenced by the urgency signals estimated from behavior (Fig. 3B).
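The bootstrap comparisons reported here can be illustrated with a minimal sketch of a percentile bootstrap confidence interval for a difference of means. The exact bootstrap procedure is not specified in this section, so the percentile method, the number of resamples, and the function name are assumptions for the example.

```python
import random


def bootstrap_mean_diff_ci(a, b, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b).

    Each group is resampled with replacement independently.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        resampled_a = [rng.choice(a) for _ in a]
        resampled_b = [rng.choice(b) for _ in b]
        diffs.append(sum(resampled_a) / len(a) - sum(resampled_b) / len(b))
    diffs.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return diffs[lo_idx], diffs[hi_idx]
```

A confidence interval that excludes zero indicates a reliable difference between the two groups, for instance between activity preceding long and short decisions within one block.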
Next, we performed a within-block correlation analysis between DD and firing rate in a 200 ms window preceding decision commitment (illustrated in two example cells in Fig. 10E,F) for each of the PMd and M1 cells positively or negatively correlated with DD. As expected, we found a significant correlation between activity and DD in all cells when the analysis is conducted in the slow blocks (this trivial significant correlation was indeed the criterion for including a cell in this analysis). The percentage of cells showing a significant correlation between these two variables was lower in the fast condition. More interestingly, we found that, in both areas, neural activity before commitment was modulated as predicted by the UGM: for the positively (negatively) correlated cells, the regression slopes are usually positive (negative) and steeper in the slow blocks and intercepts are higher (lower) in the fast blocks (Fig. 11D,E for the PMd and M1 positively correlated cells; Fig. 11I,J for the negatively correlated cells).
Velocity-related neural activity in PMd and M1 before movement execution
Finally, we looked at another type of cell, usually not significantly tuned during the deliberation process but tuned just before movement onset. The premovement activity in these PMd and M1 cells is occasionally correlated with the speed of the upcoming reaching movement (Moran and Schwartz, 1999; Churchland et al., 2006). Figure 12A illustrates an example speed-sensitive cell (correlation between reach peak velocity and PT-related activity in the 300 ms epoch preceding movement onset, r = 0.47, p < 0.001) recorded in M1. Interestingly, when sorted as a function of DD, the speed-related premovement burst is stronger after longer decisions (correlation between DD and PT-related activity in the 300 ms epoch preceding movement onset, r = 0.47, p < 0.001). Consistent with the assumption that this burst influences the upcoming movement speed, we observed that the longest decisions tend to lead to the fastest movements (r = 0.4, p < 0.05; Fig. 12B).
In a previous report (Thura et al., 2014), we showed that the reaching movements executed by the monkeys to report their decisions were shorter in duration after longer decisions. We also showed that this effect was block dependent, as the range of the duration reduction was bigger in the slow blocks of trials than the fast blocks and the difference was larger for the shortest decisions. Because this pattern is again reminiscent of the urgency signals estimated from choice behavior, we hypothesized that the same urgency signal that pushes animals to make their decision also influences how they execute their reaching movements to report these decisions (for more details, see Thura et al., 2014).
The manner in which monkeys reduced the duration of movements was complex and idiosyncratic. In brief, Monkey S mainly increased movement speed as time passed and also increased speed in the fast blocks compared with slow blocks. In contrast, Monkey Z used a combination of increased speed and reduced amplitude to reduce movement duration after longer decisions. Nevertheless, in all cases, the speed of movements increased with DD, as if the vigor of actions is related to the urgency at the time the decision is made. The activity of the speed-related cell depicted in Figure 12A is consistent with this proposal.
To assess this hypothesis at the population level, we selected all of the PMd and M1 cells that show a significant and positive correlation between movement speed and the firing rate in a 300 ms period preceding movement onset (Pearson coefficient of correlation, p < 0.05). For each of these cells and regardless of the block, we calculated the median RT that divides the trials into two equal-size groups. Finally, we plotted for each block the average activity of the speed-sensitive cells in the two categories of trials: the shortest versus the longest decisions.
In Monkey S (Fig. 12C,D), we found that, despite the low number of trials corresponding to the 10 PMd and 9 M1 speed-sensitive cells, movements following the longest decisions were faster compared with those that followed the shortest decisions, especially in the slow blocks of trials (CIs from bootstrap, p < 0.0001). In the fast blocks, the increase of movement speed with DD was smaller but still significant (p < 0.001 and p < 0.05 for the sessions during which PMd and M1 have been recorded in Monkey S, respectively). As predicted by the UGM, we found that movements following the shortest decisions were significantly faster when performed in the fast blocks than in the slow blocks (Fig. 12C,D, CIs computed from a bootstrap test; p < 0.0001). This is in agreement with our observations based on the large behavioral dataset (Thura et al., 2014) and in agreement with the proposal that the context-dependent urgency signal that determines DTs also influences how Monkey S executes his movements. The neural recordings presented in the current paper are consistent with this hypothesis by showing that PMd speed-related cells have a stronger burst after longer decisions (35 vs 31 Hz in the fast blocks, p < 0.01; 35 vs 30.5 Hz in the slow blocks, p < 0.001), presumably leading to faster movements. For the 9 M1 cells recorded in Monkey S, firing rate is also significantly stronger after long decisions compared with short decisions, regardless of the block condition (p < 0.0001). A block-dependent effect of DD on speed-related activity also appears in Monkey Z. For the M1 cells, the effect of DD is only significant in the slow blocks (p < 0.0001; Fig. 12F). In these cells, activity was also stronger in the fast blocks compared with the slow blocks for the shortest decisions (p < 0.05), but not for the longest, as predicted by the UGM. For the 9 PMd cells recorded in Monkey Z, the effect of DD on activity is significant (p < 0.0001) in both the slow and the fast blocks (Fig. 12E). The larger range of the effect of urgency in the slow blocks of trials as well as the amplification of activity in the fast blocks for the shortest decisions (p < 0.001) but not for the longest is in agreement with our detailed behavioral data (Thura et al., 2014). However, because of the low number of speed-related neurons we sampled, any conclusions based on these results must be considered preliminary and require additional study.
Discussion
Reward rate is a major factor motivating animal behavior (Bogacz et al., 2010b; Balci et al., 2011); and to optimize it, animals must adjust their SAT during both decision-making and movement execution (Chittka et al., 2009). SAT adjustment in decisions has generally been discussed in terms of EAMs (Ratcliff, 1978; Bogacz et al., 2010a), in which control of the SAT is equivalent to controlling the accuracy criterion of the evidence needed for commitment. At the neural level, this could be accomplished through a variety of mechanisms, including changing the neural firing threshold for initiating movement, the baseline activity before evidence processing, or the gain with which evidence is processed (for review, see Standage et al., 2014; Heitz, 2014).
Recently, two studies have examined SAT adjustment at the single-cell level. During a visual search task, Heitz and Schall (2012) found that the instruction to respond slowly reduced the baseline activity of FEF visual cells and delayed the onset and gain of their sensory processing, and also reduced the activity of FEF movement cells. During a motion discrimination task, Hanks et al. (2014) found that, when accuracy was emphasized (by delaying reward delivery), activity in lateral intraparietal area exhibited both a decreased baseline and gain but did not change processing onset time or the activity before saccade initiation. Human neuroimaging studies have also reported that an emphasis on speed increases baseline activity in the striatum, the presupplemental motor area, as well as premotor and parietal regions (Forstmann et al., 2008; Ivanoff et al., 2008; van Veen et al., 2008; Bogacz et al., 2010a; Mansfield et al., 2011; Wenzlaff et al., 2011). These findings make good sense because, with a higher baseline and/or gain, it is easier for neural activity to build up to whatever is the threshold (Hanes and Schall, 1996; Roitman and Shadlen, 2002; Ratcliff et al., 2007) or dynamic attractor (Grossberg, 1973; Wang, 2002; Standage et al., 2011) that determines the commitment to a decision.
A variety of models have been proposed to explain neural activity during decision-making tasks. These include the drift-diffusion model (Ratcliff, 1978; Gold and Shadlen, 2007), which assumes perfect integration of sensory evidence to a fixed accuracy criterion, as well as variations in which the accuracy criterion decreases over time (Drugowitsch et al., 2012) possibly due to a rising urgency signal (Ditterich, 2006; Churchland et al., 2008) or in which integration is “leaky” (Usher and McClelland, 2001). One can conceive of each of these models as lying within a space defined by different assumptions about parameter settings (Thura, 2015). For example, the drift-diffusion model lies at a corner corresponding to zero leak and zero urgency, the leaky competing accumulator (Usher and McClelland, 2001) assumes leak but no urgency, the bounded accumulator with urgency (Drugowitsch et al., 2012) assumes no leak but a growing urgency, and the UGM assumes both a large leak and growing urgency (Cisek et al., 2009; Thura et al., 2012). While there is an ongoing debate as to which model is best (Hawkins et al., 2015a; Hawkins et al., 2015b) and whether the answer may be task-dependent, it is important to note that all of these models permit similar mechanisms for SAT adjustment (Heitz, 2014). In particular, an increase in the baseline and/or gain of neural responses would increase the speed and reduce the accuracy of all these models, regardless of where they lie in the space of parameter settings.
Here, we have used the UGM to interpret our data. The reason is that our previous analyses of behavior and neural activity during the tokens task (from the same animals) have already ruled out other classes of models. In particular, we have shown that neural activity in PMd and M1 quickly tracks the state of sensory evidence without integrating it (Thura and Cisek, 2014) (Figs. 4, 5), which is incompatible with the drift-diffusion model and other long-time-constant models. We have also shown that neural activity increases with time even for identical levels of sensory evidence (Thura and Cisek, 2014) (Fig. 3). A short time constant is important in any task in which the sensory information can change at any time, as during natural behavior, and an urgency signal (or some mechanism for reducing the accuracy criterion over time) is optimal for maximizing reward rates in any free response task (Drugowitsch et al., 2012; Gluth et al., 2012; Thura et al., 2012). In the context of the UGM, analyses of behavioral data in our task have suggested that, to adjust the SAT, the animals change the intercept and slope of their urgency signal in a block-dependent manner (Thura et al., 2014). In particular, the signal starts higher but rises more slowly in the fast block than in the slow block (Fig. 3B).
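To make the role of the urgency intercept and slope concrete, the following is a simplified sketch of an urgency-gating scheme: evidence is low-pass filtered with a short time constant, multiplied by a linearly growing urgency signal, and compared with a fixed threshold. All parameter values (time constant, threshold, intercepts, slopes) are arbitrary illustrative assumptions, not the values fitted to the monkeys' behavior, and the function name is hypothetical.

```python
def ugm_commit_time(evidence, intercept, slope_per_s, threshold,
                    tau_ms=200.0, dt_ms=10.0):
    """Return the commitment time (ms), or None if no commitment occurs.

    evidence: sequence of signed evidence samples, one per time step.
    Urgency is modeled as intercept + slope_per_s * elapsed time (s).
    """
    filtered = 0.0
    for step, sample in enumerate(evidence):
        t_ms = (step + 1) * dt_ms
        # First-order low-pass filter: tracks evidence without integrating it.
        filtered += (dt_ms / tau_ms) * (sample - filtered)
        urgency = intercept + slope_per_s * t_ms / 1000.0
        if abs(filtered * urgency) >= threshold:
            return t_ms
    return None


# Same constant evidence and threshold, two hypothetical urgency settings.
evidence = [0.2] * 300  # 3 s of weak constant evidence
fast_block = ugm_commit_time(evidence, intercept=1.0, slope_per_s=0.5,
                             threshold=0.15)
slow_block = ugm_commit_time(evidence, intercept=0.2, slope_per_s=1.0,
                             threshold=0.15)
```

With these illustrative settings, the higher-intercept, shallower-slope signal (the "fast" setting) commits earlier than the lower-intercept, steeper one, even though the threshold is identical in both cases.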
Our neural data are consistent with this proposal. For cells reflecting the deliberation process (“decision cells,” 29% in PMd and 45% in M1), baseline activity is higher in the fast than the slow block (Figs. 6, 7), and their response to equal sensory evidence is stronger in the fast than the slow block, especially during the early part of trials (Fig. 8). This is also consistent with previous studies. In particular, if the urgency signal modulates how sensory evidence is turned into neural activity, then it would explain both the changes in baseline activity (when the evidence for each of two choices is 0.5) as well as the changes in gain after stimulus information is provided, whether or not that gain is time-dependent in a given task (Ditterich, 2006; Cisek et al., 2009; Thura et al., 2012). In other words, despite the significant differences between tasks (static vs changing evidence, instructed vs volitional SAT adjustment), despite the different regions from which neural activity is recorded (frontal vs parietal, skeletal vs oculomotor systems), and despite the differences in the theoretical frameworks used to interpret the data, a consistent theme emerging from all studies of SAT is that it involves adjusting the baseline and gain of neural processing.
At commitment time, decision-related activity tuned to the preferred target reached a peak that was the same in both blocks, at least at the population level (Fig. 9). This supports the idea that commitment occurs when the total intensity of neural activity related to a choice reaches a fixed threshold or dynamic attractor that is consistent across conditions (Hanes and Schall, 1996; Gold and Shadlen, 2007), in agreement with fMRI results in the pre-SMA (Ivanoff et al., 2008). Likewise, Hanks et al. (2014) found that lateral intraparietal area activity just before the saccade was similar for both “speed” and “accuracy” conditions (even though recordings were performed on different days). In contrast, Heitz and Schall (2012) found that activity in FEF movement neurons was lower when animals were instructed to respond more slowly, and suggested that this may be due to the properties of cells downstream of FEF.
Although the activity of decision-related cells around the time of commitment was similar between the slow and fast blocks (Fig. 9), we found that within each block it was correlated with DD in 30% of cells in PMd and M1 (Fig. 10D,F). In particular, for those neurons that were positively correlated with DD, the regression had a higher intercept and lower slope in the fast than the slow block (Figs. 10F, 11D,E); and for neurons negatively correlated with duration, the reverse was seen (Fig. 11I,J). This pattern of results is consistent with the block-dependent urgency signals estimated from behavioral data (Fig. 3B). We also found a similar pattern of results in the premovement activity of cells that correlate with the speed of movement (Fig. 12), suggesting that the same urgency signal energizes both the speed of a decision and the vigor of the selected movement (Thura et al., 2014). However, because of the low number of cells and trials, this conclusion must be considered preliminary.
It is possible that our results are specific to free-response tasks in which sensory information is changing, such as during natural behavior, but not to the kind of static perceptual discrimination tasks usually studied in the laboratory. However, the UGM provides a good fit to such data as well; and indeed, Hawkins et al. (2015a) showed that it fits better than perfect accumulator models without urgency to the classic data from Roitman and Shadlen (2002). Furthermore, Carland et al. (2015) showed that, even during random-dot motion discrimination tasks, the motion signal is not integrated with the long time constant usually assumed by EAMs, but with a time constant on the order of 200 ms. This is consistent with the UGM and with the proposal that neural activity buildup is primarily caused by an urgency signal. Thus, it remains to be seen whether the UGM applies in general or whether there is sufficient justification for proposing a system for changing the mechanism in a task-dependent manner. Nevertheless, as noted above, the widespread finding of baseline and gain changes suggests a strategy for SAT adjustment that generalizes across many scenarios.
If an urgency signal controls the SAT, then what is its source? Previous work implicates the striatum (Forstmann et al., 2008; van Veen et al., 2008; Bogacz et al., 2010a; Nagano-Saito et al., 2012), which projects to PMd and M1 through the globus pallidus and dorsomedial thalamus (Middleton and Strick, 2000). Furthermore, there is compelling evidence that pallidal output regulates movement vigor (Horak and Anderson, 1984a, b; Turner and Desmurget, 2010). Because we observed a strong link between the vigor of movements and the urgency signal derived from choice behavior (Thura et al., 2014), we hypothesize that SATs for deciding and acting may be controlled by a shared urgency/vigor signal that originates in the basal ganglia and controls both the gain of PMd/M1 decision-related cells as well as the premovement burst of cells that influence movement velocity. This leads to the prediction that, while monkeys perform the tokens task, neural activity in the globus pallidus will exhibit patterns resembling our hypothetical urgency signal (Fig. 3B). Namely, it will not be directionally tuned during deliberation but will be related to elapsing time and will change when monkeys switch between a hasty versus a conservative decision policy.
Footnotes
This work was supported by Canadian Institutes of Health Research Grant MOP-102662, the Canadian Foundation for Innovation, Fonds de Recherche en Santé du Québec, the EJLB Foundation to P.C., and fellowships from the FYSSEN Foundation and the Groupe de Recherche sur le Système Nerveux Central to D.T. We thank Marie-Claude Labonté for technical support.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. David Thura, Université de Montréal, 2960 Chemin de la Tour, Montréal, Québec H3T 1J4, Canada. david.thura@umontreal.ca