Abstract
Reward prospect weighs on motor decision processes, enhancing the selection of appropriate actions and the inhibition of others. While many studies have investigated the neuronal basis of reward representations and of cortical control of actions, the neuronal correlates of the influence of reward prospect on motor decisions are less clear. We recorded from the dorsal premotor cortex (PMd) of 2 male macaque monkeys performing a modified version of the Stop-signal (countermanding) task. This task challenges motor decisions by requiring a response to a frequent Go stimulus, and the suppression of this response when a rare Stop signal is presented during the reaction time. We unbalanced the motivation to respond or to suppress the response by presenting a cue informing about three different reward schedules: in one case, Go trials were rewarded more than Stop trials; in another case, Stop trials were rewarded more than Go trials; in the last case, both types of trials were rewarded equally. Monkeys adopted different strategies according to reward information provided by the cue: the higher the reward for Stop trials, the higher their ability to suppress the response and the slower their response to Go stimuli. PMd neuronal activity evolved in time and correlated with the behavior: PMd signaled first the cue salience, representing the chance to earn the highest reward at stake, then reflected the shaping of the motor choice by the motivation to move or to stop. These findings represent a neuronal correlate of the influence of reward information on motor decision.
SIGNIFICANCE STATEMENT The motivation to obtain rewards drives how animals act on their environment. To explore the involvement of motor cortices in motivated behaviors, we recorded high-resolution neuronal activity in the premotor cortex of monkeys performing a task that manipulated the motivation to generate/withhold a movement through different cued reward probabilities. Our results show the presence of neuronal signals dynamically reflecting the salience of the cue, in the time immediately following its presentation, and a motivation-related activity in performing (or cancelling) a motor program, as the behavioral response approached. The encoding of multiple reward-related signals in this region suggests an important role of premotor areas in the reward circuitry supporting action.
Introduction
Animals strategically adjust their behavior to achieve their desired goals. Optimality is often obtained by either acting on the environment or refraining from acting (e.g., when acting is no longer convenient), according to the specific context. The context can be defined by the interplay of the different stimuli that inform about the possibility of obtaining a reward and, eventually, motivate the subjects to execute (or refrain from) specific actions. To support these processes, the brain must continuously convert contextual information into a motor decision, selecting between action generation and action inhibition.
Electrophysiological, imaging, and lesion studies succeeded in identifying the role of different brain areas in managing motivation-related and other reward-related signals (Kennerley and Walton, 2011; Schultz et al., 2011; Louie and Glimcher, 2012). While prefrontal cortex and parietal cortex seem more involved in encoding the value and salience of the sensory signals used to inform about the incoming reward, the activity in motor cortical areas was proposed to be mostly related to the motivation to perform an action (Bissonette et al., 2014; Bissonette and Roesch, 2016), even if modulated by other factors, such as reward expectation (Roesch and Olson, 2004; Marsh et al., 2015), reward feedback (Ramkumar et al., 2016), and anticipation of reward delivery (Ramakrishnan et al., 2017).
However, despite these studies, it is still unknown how contextual signals are transformed (or integrated) in motor cortical areas into a specific motivation affecting the balance between action generation and suppression.
To investigate this topic, we combined a cueing paradigm with a well-established motor control task, the Stop-signal task (Logan and Cowan, 1984). This task requires, in most trials (Go trials), the generation of a movement following a Go signal and, in a minority of trials (Stop trials), the suppression of the movement when an unpredictable Stop signal is presented after the Go signal.
In our new task, a cue was presented at the beginning of each trial, well before the presentation of the Go and Stop signals, defining unequivocally one of three specific reward schedules: in one-third of the trials, correct Go trials were rewarded more than correctly inhibited Stop trials; in another third, the opposite schedule was adopted; and in the remaining third, correct Stop and correct Go trials were equally rewarded. Our main aim was to have a task able to strongly influence the balance between action generation and suppression.
We recorded neuronal activity from the dorsal premotor cortex (PMd) of monkeys while they were performing the task. PMd is a frontal motor area crucial for the formation of action plans (Wise, 1985; Shenoy et al., 2013) and for the executive control of motor acts (Mirabella et al., 2011; Pani et al., 2014, 2018; Saberi-Moghadam et al., 2016; Cirillo et al., 2018), and is primarily linked to the motivation to act (Roesch and Olson, 2004). As such, it is an area where motivational signals can be integrated in action decision processes.
We found that the animals adopted a strategy strongly influenced by the cue representing the reward schedule, to maximize the chance of getting as much reward as possible. Moreover, we observed that PMd neuronal activity strongly correlated with the adopted strategy.
Specifically, a sustained neuronal activity was observed immediately after the presentation of the most salient visual signals; that is, those cues associated with the chance to earn higher rewards in one trial type (Go/Stop) and so representing relevant information for the upcoming behavioral plan. Then, the neuronal activity mostly signaled the motivation to perform or to cancel the action, up to the time of movement initiation or suppression, in Go and Stop trials, respectively.
Our results reveal how PMd activity shifts from reflecting the salience of the information provided by the signals received (representing the chance to earn a higher reward in one trial type), to reflecting the shaping of motor decisions by the motivation to move or to stop.
Materials and Methods
Subjects
Three adult male rhesus monkeys (Macaca mulatta; 10-13 kg in weight), designated Monkeys 1-3, participated in this study. Monkey 3 was studied only behaviorally. The monkeys had free access to food and controlled access to water during the experiments. They received different amounts of fruit juice as a reward for performing the task. All experimental procedures, animal care, housing, and surgical procedures conformed to the European (Directive 2010/63/EU) and Italian (DL 26/2014) laws on the use of nonhuman primates in scientific research and were approved by the Italian Ministry of Health.
Experimental design
Experiments were performed in a darkened, acoustically insulated room. Monkeys were seated, with the head fixed, in front of a black isoluminant background (<0.1 cd/m2) of a 17 inch touchscreen monitor (LCD, 800 × 600 resolution). Liquid reward was delivered from a tube positioned between the monkeys' lips, and eye movements were monitored by using a noninvasive Eye-tracker (Arrington Research). A noncommercial software package, CORTEX (www.nimh.nih.gov), was used to control the presentation of stimuli and behavioral responses.
Each monkey was instructed to perform a modified version of the Stop-signal task (Fig. 1A) consisting of Go (75%) and Stop trials (25%) randomly intermixed. Each trial started when the monkeys touched a red central target (CT) on the screen. After a variable waiting time (400 ± 20 ms), one of three possible visual cues appeared just slightly above the CT. Cues were black and white abstract images (bitmaps, 16° × 16° of visual angle), and informed about the condition of reward (as explained later). The monkeys were required to continue touching the CT for an additional 1000 ms (cue epoch), after which a peripheral target (red circle; PT), corresponding to a Go signal, randomly appeared in one of two possible locations (i.e., at the right or left of the screen vertical midline). In Go trials, the monkeys had to detach their hand from the CT (reaction time [RT]; maximum allowed time: 1400 ms for Monkey 1, and 1200 ms for Monkeys 2 and 3), and to touch the PT for a random time (900-1200 ms), to obtain the reward (correct Go trials). If the monkeys detached after the maximum allowed time or did not detach from the CT, they did not receive the reward and the trial was classified as a late response error trial.
Figure 1. Behavioral paradigm and electrophysiological recording site. A, Motivational Stop-signal task (sequence of events for the Go+ condition). Each trial started when the monkeys touched a CT on the screen. After a variable delay, a cue (black and white abstract images) appeared above the central target, informing about the condition of reward (amount of the expected reward; see B). The monkeys were required to continue touching the CT (cue epoch). After 1000 ms, a PT randomly appeared on the right or left of the screen (Go signal) instructing the monkeys to detach their hand from the CT (RT) and to reach the PT (movement time [MT]) to receive the reward (Go trials, 75%). Occasionally, at a variable delay (SSD) from the Go signal, the cue became red (Stop signal). In this instance, the monkeys had to hold the CT to earn the reward (Stop trials, 25%). The white halo around either the CT or the PT was used as feedback of touch for the monkeys. Cue, Cue presentation; Go, Go signal; MovON, movement onset; Stop, Stop signal. B, Schematic of the different levels of cue salience and motivation (either to go or to stop) for Go+, Neutral, and Stop+ conditions. The amount of reward for every cue condition was differently associated with Go and Stop trials as follows: Go+ condition was associated with a higher reward for successful Go trials compared with successful Stop trials; Stop+ condition was associated with reversed reward amounts; last, a Neutral control condition was associated with equal amounts of reward delivered for successful Go and Stop trials. C, Array location on the left PMd of Monkey 1 and the right PMd of Monkey 2. Left, Illustrative brain figure representing the locations of arrays based on reconstruction from photographs taken during surgery after dura opening and array insertion. Right, Example of picture taken during surgery (Monkey 1). AS, Arcuate sulcus; PCD, precentral dimple; CS, central sulcus.
In Stop trials, the initial sequence of events was identical to the Go trials. However, at a variable delay from the Go signal (Stop signal delay [SSD]), the cue unpredictably became red (Stop signal) and, in this instance, the monkeys had to hold the central position for an additional interval (900-1000 ms) to earn the juice (correct Stop trials). If the monkeys moved the hand from the CT, the trial was considered a wrong Stop trial and no reward was given. Both correct and incorrect trials were followed by a 2000 ms intertrial interval, during which the screen became black.
Within a session, three different cue conditions were selected with equal probability and intermixed in a random trial-by-trial fashion (Fig. 1B). In the Go+ condition, the cue signaled that, if the ongoing trial continued with a request of movement execution (Go trials), the monkey would receive a higher amount of reward than in trials ending with a request of movement cancellation (Stop trials). Conversely, in the Stop+ condition, the amounts of reward associated with movement cancellation and execution were reversed. Last, in the Neutral condition, the reward schedule was the same as in the classic Stop-signal task; that is, the reward was equally delivered for successful Stop and Go trials. Figure 1B shows how the three visual cues, in relation to the trial type, could assume a different value in salience and in motivation to either move or cancel the planned movement.
In this regard, both the cue signaling a bigger reward on Go trials (Go+) and the cue signaling a bigger reward on Stop trials (Stop+) are expected to be more salient than the cue associated with the Neutral condition (Neutral). This is because, in both cases, a specific trial type is associated with a higher reward compared with the other. As such, these two cues are relevant for the planning of the subsequent action to maximize the chance to earn the highest reward at stake. As regards motivation, a higher reward associated with Go trials than with Stop trials (Go+ condition) is supposed to be linked to a higher motivation to move, compared with the same reward for Go and Stop trials (Neutral) and, in turn, a higher reward for Stop compared with Go trials (Stop+). Consistently, the motivation to inhibit a movement would be higher for Stop+ compared with Neutral and, in turn, Go+ condition.
SSDs were presented according to a fixed SSD procedure. Based on the average RT measured at the beginning of each session, four progressively longer SSDs were computed so that, in the Neutral condition, the monkeys were able to successfully inhibit a movement in ∼85%, 65%, 35%, and 15% (and, overall, in ∼50%) of the Stop trials (for a similar procedure, see Mirabella et al., 2011). Whenever, after some trials, we realized that the performance did not satisfy the above-defined criteria, the SSDs were adjusted, and the session was restarted until a good control of the behavior was obtained. For this procedure, we used the Neutral condition as reference. The same SSDs were then used for each of the cue conditions to compare stop performances and measures of inhibitory control. Considering all sessions, the SSDs ranged from 170 to 770 ms (with a 200 ms step) in Monkey 1, from 170 to 770 ms (with a 150 ms step) in Monkey 2, and from 170 to 670 ms (with a 100 ms step) in Monkey 3.
Surgery
A single 96 channel Utah array (Blackrock Microsystems) was implanted over the left PMd of Monkey 1 and the right PMd of Monkey 2 (using as anatomic landmarks, after dura opening, the arcuate sulcus and the precentral dimple; Fig. 1C). The sites of the implants were contralateral to the arm used during the experiment. Furthermore, a head-holding device was implanted in each monkey. All the surgeries were performed under sterile conditions and veterinary supervision. Anesthesia was induced with ketamine (Imalgene, 10 mg kg−1 i.m.) and medetomidine hydrochloride (Domitor, 0.04 mg kg−1 i.m.) and maintained by inhalant isoflurane (0.5%-4%) in oxygen. Antibiotics were administered prophylactically during the surgery and postoperatively for at least 1 week. Postoperative analgesics were given at least twice daily.
Neuronal recordings
Unfiltered raw activity was recorded from each electrode of the Utah array (Blackrock Microsystems) using a TDT system (Tucker-Davis Technologies; sampling rate 24.4 kHz). From this signal, we extracted offline the spectral estimate of the multiunit activity (MUA), as previously described (Mattia et al., 2013), by computing the time-varying power spectra P(ω,t) from the short-time Fourier transform of the recorded unfiltered electric field potential in ±40 ms sliding windows (20 ms steps). The relative spectra R(ω,t) were obtained by normalizing P(ω,t) by its average Pref(ω) across the first 1200 s of the recording, that is, R(ω,t) = P(ω,t)/Pref(ω). Our spectrally estimated MUAs were the average of R(ω,t) across the 0.3-2.0 kHz band. The extracted MUA is a mesoscopic local signal, considered a good approximation of the average firing rate close to the electrode tip (Mattia and Del Giudice, 2002). Similar approaches have been proposed previously (Supèr and Roelfsema, 2005; Stark and Abeles, 2007; Giarrocco et al., 2021).
Statistical analysis
All the analyses were conducted using MATLAB software (The MathWorks), and statistical comparisons were performed by means of MATLAB and the software Statistica (StatSoft).
Behavioral analysis
We analyzed in detail behavioral data from sessions in adherence with the assumptions of the race model (Logan and Cowan, 1984) (see below; Monkey 1: 2 sessions; Monkey 2: 3 sessions; Monkey 3: 9 sessions).
We tested error rates in Go trials (i.e., the probability of late response error trials) and in Stop trials (i.e., the probability of wrong Stop trials), merging data from different recording sessions and applying z-score tests (p < 0.05) to each pair of cue conditions, separately for each monkey.
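As an illustration of the pairwise comparison of error rates, the following is a minimal sketch that assumes the z-score test is a pooled two-proportion z test (the exact variant is not specified in the text). The function name and the counts are hypothetical and are not data from the study.

```python
import math

def two_proportion_z_test(errors_a, trials_a, errors_b, trials_b):
    """Two-sided z test for the difference between two error rates.

    Assumption: a pooled two-proportion z test; the source only states
    that z-score tests were applied to each pair of cue conditions.
    """
    p_a = errors_a / trials_a
    p_b = errors_b / trials_b
    # Pooled proportion under the null hypothesis of equal error rates.
    pooled = (errors_a + errors_b) / (trials_a + trials_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (p_a - p_b) / se
    # Two-sided p value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts of wrong Stop trials in two cue conditions.
z, p = two_proportion_z_test(errors_a=60, trials_a=100, errors_b=40, trials_b=100)
print(round(z, 3), round(p, 4))
```

With three cue conditions, the function would be called once per pair (Go+ vs Neutral, Go+ vs Stop+, Neutral vs Stop+), separately for each monkey.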
In Go trials, we compared RTs among the three cue conditions, merging data collected in the different sessions; we then performed a one-way ANOVA (α = 0.05) separately for each monkey.
Performance in Stop trials was further evaluated within the framework of the race model (Logan and Cowan, 1984) by calculating the latency of the stop process: that is, the Stop signal RT (SSRT). The race model assumes that, during Stop trials, two stochastic processes race toward a threshold: the go and the stop processes, triggered by the appearance of the Go signal and the Stop signal, respectively. The result of this race, either movement generation in wrong Stop trials or movement inhibition in correct Stop trials, will depend on which of these processes reaches its own threshold first. In correct Stop trials, the stop process wins over the go process, and vice versa in wrong Stop trials. The main assumption underlying the race model is that the go and stop processes are independent of each other (independence assumption). In particular, the model assumes two types of independence: (1) that on a given trial, the latency of the go process does not depend on the latency of the stop process (stochastic independence); and (2) that the go process in Stop trials must be the same as in Go trials since the go process must be unaffected by the presence of the Stop signal (context independence). Only if the independence assumption is validated can the race model be used for the estimation of the SSRT. To validate the independence assumption, wrong Stop trial RTs must be shorter than the correct Go trial RTs (Schall et al., 2017; Matzke et al., 2018).
For each session, we estimated the SSRT only if the independence assumption was confirmed for the Neutral condition. To calculate the SSRT, we used the integration method: for any given SSD, go RTs are rank ordered and the nth go RT is selected, where n is the number of go RTs multiplied by the probability of responding at a given SSD. SSRT is obtained by subtracting the SSD from the nth RT. Since this method assumes the SSRT to be a constant value, averaging the SSRTs obtained at different SSDs provides the final SSRT estimation (Matzke et al., 2018). The SSRT was estimated separately for each cue condition. To compare SSRTs between conditions, we performed a Friedman test for repeated measures (α = 0.05), testing together the SSRTs calculated from the different sessions of the 3 monkeys.
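The integration method described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the dictionary layout, and the rounding of n (ceiling, clamped to a valid rank) are choices made here and are not specified in the text; the RTs and probabilities are made up.

```python
import math

def ssrt_integration(go_rts, p_respond_by_ssd):
    """Estimate the SSRT (ms) with the integration method, averaged over SSDs.

    go_rts: RTs (ms) from correct Go trials of one cue condition.
    p_respond_by_ssd: dict mapping SSD (ms) -> probability of responding
        (wrong Stop trials / all Stop trials) at that SSD.
    """
    sorted_rts = sorted(go_rts)
    estimates = []
    for ssd, p_respond in p_respond_by_ssd.items():
        # n = number of Go RTs multiplied by P(respond | SSD), used as a 1-based rank.
        n = math.ceil(len(sorted_rts) * p_respond)
        # Assumption: clamp the rank so that extreme probabilities stay in range.
        n = min(max(n, 1), len(sorted_rts))
        estimates.append(sorted_rts[n - 1] - ssd)
    # The method treats the SSRT as constant, so the per-SSD values are averaged.
    return sum(estimates) / len(estimates)

# Hypothetical data: 100 Go RTs from 400 to 796 ms, two SSDs.
go_rts = list(range(400, 800, 4))
ssrt = ssrt_integration(go_rts, {200: 0.25, 300: 0.5})
print(ssrt)
```

In the study, this estimate would be computed separately for each cue condition and session before the Friedman test.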
We investigated whether the behavioral performance in our version of the Stop-signal task could be better explained by trial history (i.e., effect of the recent type of trial) rather than by cue condition, by performing three different analyses. First, for each session (n = 14 sessions coming from all monkeys together), we sorted trials looking for pairs of successive trials: Go–Go pair formed by the sequence Go trial, Go trial; and Stop–Go pair formed by the sequence Stop trial, Go trial. For each of these sequences, we computed the error rate in the second trial of the pair based on the first trial. The last component of each sequence was distinguished based on the cue condition to observe variations in this trial based on the cue compared with the preceding critical trial. Second, we repeated this analysis for different pairs of successive trials: Go–Stop pair formed by the sequence Go trial, Stop trial; and Stop–Stop pair formed by the sequence Stop trial, Stop trial. Last, for each session, we sorted trials looking for triplets of successive trials: Go–Go–Go triplet formed by the sequence correct Go trial, correct Go trial, correct Go trial; Go–Stop–Go triplet formed by the sequence correct Go trial, correct Stop trial, correct Go trial; Go–Wrong stop–Go triplet formed by the sequence correct Go trial, wrong Stop trial, correct Go trial. For each of these sequences, we subtracted the RT of the first Go trial from the RT of the last Go trial, obtaining the post-trial ΔRT. This procedure should help in isolating effects related to the close critical trial (i.e., the central trial), avoiding global effects of RT variations (Nelson et al., 2010; Dutilh et al., 2012). The last component of each sequence was distinguished based on the cue condition to observe variations in this trial based on the cue compared with the preceding critical trial. For each of the three analyses, we performed a two-way ANOVA (α = 0.05) with cue condition and trial sequence as factors.
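The triplet-based post-trial ΔRT can be sketched as below. The trial representation (a list of dictionaries with the keys 'type', 'correct', 'rt', and 'cue') is a hypothetical layout chosen for illustration, and the example trials are invented.

```python
def post_trial_delta_rt(trials):
    """Collect post-trial delta RTs (last Go RT minus first Go RT) per triplet.

    trials: list of dicts in session order with hypothetical keys
        'type' ('go' or 'stop'), 'correct' (bool), 'rt' (ms), 'cue'.
    Returns {(triplet_name, cue_of_last_trial): [delta_rt, ...]}.
    """
    middle_names = {
        ("go", True): "Go-Go-Go",
        ("stop", True): "Go-Stop-Go",
        ("stop", False): "Go-Wrong stop-Go",
    }
    out = {}
    # Slide over every run of three successive trials.
    for first, middle, last in zip(trials, trials[1:], trials[2:]):
        # The first and last trials must both be correct Go trials.
        flanks_ok = all(t["type"] == "go" and t["correct"] for t in (first, last))
        name = middle_names.get((middle["type"], middle["correct"]))
        if flanks_ok and name is not None:
            # Sort by the cue condition of the last trial of the triplet.
            key = (name, last["cue"])
            out.setdefault(key, []).append(last["rt"] - first["rt"])
    return out

# Hypothetical sequence of five trials.
session = [
    {"type": "go", "correct": True, "rt": 500, "cue": "Neutral"},
    {"type": "stop", "correct": False, "rt": 480, "cue": "Go+"},
    {"type": "go", "correct": True, "rt": 540, "cue": "Stop+"},
    {"type": "go", "correct": True, "rt": 520, "cue": "Go+"},
    {"type": "go", "correct": True, "rt": 510, "cue": "Neutral"},
]
delta = post_trial_delta_rt(session)
print(delta)
```

The resulting lists, grouped by triplet type and cue condition, are the kind of input a two-way ANOVA with those two factors would take.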
Neuronal analysis
Multielectrode data in this paper come from one session for Monkey 1, and two sessions for Monkey 2. Only in Monkey 2 did we select a second session sufficiently separated in time from the other (2.5 months' interval). In Monkey 1, this was not possible, and we preferred avoiding oversampling of the same mesoscopic neuronal activity from each electrode. Data from the same electrode and animal in different sessions were inspected for similarity in the neuronal response in the task (task-related activity; see below).
Task-related activity
We selected recordings (channels) based on their modulation during the task. For each recording, we compared the neuronal activity in correct Go trials during the cue epoch (taking two periods of 100-400 ms after the cue presentation and 540-940 ms after the cue presentation) and during the RT epoch (from −300 ms to movement onset) with the neuronal activity in the baseline period (from −260 to −60 ms before the cue appearance) using a nonparametric Wilcoxon test (α = 0.001). We classified the neuronal activity as task-related if activity during the baseline was significantly different from activity in at least one of the three epochs. All other analyses were performed on task-related recordings.
Cue-related modulation in Go trials
We first focused on three time periods: Cue-Early epoch (100-400 ms after the cue presentation), Cue-Late epoch (540-940 ms after the cue presentation), and After-Go epoch (100–260 ms after the Go signal). For each epoch of analysis, we performed a one-way ANOVA (with cue condition as factor) to evaluate the effect of the cue presentation on the neuronal activity. When the p value was significant (α = 0.05), we used multiple comparison post hoc tests to classify recordings, in each epoch of interest, based on the pattern of neuronal activity presented in the different cue conditions. We identified three main categories in relation to the cue condition (see also Fig. 1B) that together described the neuronal modulation of most recordings: Go+ & Stop+ > Neutral included those recordings showing higher activity for Go+ and Stop+ compared with Neutral (suggesting a prevalent influence of the cue salience), with this difference being significant for at least one comparison (Go+ vs Neutral or Stop+ vs Neutral); recordings whose activity was higher in Go+ compared with Neutral and, in turn, Stop+ were included in the category Go+ > Neutral > Stop+ (suggesting the influence of the motivation to move fully expressed); finally, Go+ > Neutral & Stop+ classified recordings in which Go+ was higher than Neutral and Stop+, whereas these two were not different (here suggesting a motivational signal to move partially expressed). We performed a χ2 test (α = 0.05) to check for significant differences in the frequencies of the three main categories across the three epochs of the task.
We assessed the effect of cue condition on movement direction by comparing the neuronal activity between right Go trials and left Go trials. We adopted a receiver operating characteristic (ROC) analysis: for each recording, we computed a measure of accuracy of the divergence of the neuronal activity (step 20 ms), starting from −100 to 900 ms from the Go signal (that informed about the direction of the movement). ROC values represented the level of discrimination between movement directions. For recordings with significant activity difference (ROC value >0.65 or <0.35 for at least three consecutive bins), we defined the onset of this difference as the first significant time bin. This time was considered as the latency of the discrimination between movement directions (discrimination time). The analysis was conducted separately for each cue condition. The latency distributions in the three cue conditions were then compared performing a one-way ANOVA (α = 0.05) to assess whether they differed from each other. For recordings showing an opposite pattern of modulation (ROC value <0.35), we inverted ROC values (1 – ROC value) and included them with the others.
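The bin-by-bin ROC analysis and the onset criterion (three consecutive bins with ROC value >0.65 or <0.35) can be sketched as follows. This is a simplified illustration: the ROC area is computed through its equivalence with the probability that a value from one group exceeds a value from the other, the function names and data layout are hypothetical, and the example activity values are invented.

```python
def roc_area(sample_a, sample_b):
    """Area under the ROC curve, computed as P(a > b) + 0.5 * P(a == b)."""
    wins = sum((a > b) + 0.5 * (a == b) for a in sample_a for b in sample_b)
    return wins / (len(sample_a) * len(sample_b))

def discrimination_onset(trials_a, trials_b, bin_times,
                         hi=0.65, lo=0.35, n_consecutive=3):
    """Return the time of the first bin of the first run of n_consecutive
    bins with ROC value > hi (or, in a separate run, < lo); None if absent.

    trials_a, trials_b: lists of trials, each a list with one activity
        value per time bin (e.g., right vs left Go trials).
    """
    run_start, run_len, run_sign = None, 0, 0
    for i, t in enumerate(bin_times):
        auc = roc_area([tr[i] for tr in trials_a], [tr[i] for tr in trials_b])
        sign = 1 if auc > hi else (-1 if auc < lo else 0)
        if sign != 0 and sign == run_sign:
            run_len += 1  # the current run of significant bins continues
        else:
            # Start a new run (or reset when the bin is not significant).
            run_sign, run_len, run_start = sign, (1 if sign else 0), t
        if run_len >= n_consecutive:
            return run_start
    return None

# Hypothetical activity in six 20 ms bins for three trials per direction.
bin_times = [0, 20, 40, 60, 80, 100]
right = [[1, 1, 5, 5, 5, 5], [2, 1, 6, 6, 6, 6], [1, 2, 5, 6, 5, 6]]
left = [[1, 2, 1, 1, 1, 1], [2, 1, 2, 2, 2, 2], [1, 1, 1, 2, 1, 2]]
onset = discrimination_onset(right, left, bin_times)
print(onset)
```

The same onset logic, with correct Stop and latency-matched Go trials as the two groups, corresponds to the criterion used for the neuronal stopping time described in the Stop-trial analysis.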
To investigate how the neuronal activity of the different cue conditions changed over time, we estimated two contrast indices (CIs) comparing each of the two salient cue conditions (Go+ or Stop+) with the Neutral condition. The CIs were called: Go+ versus Neutral and Stop+ versus Neutral and were computed as follows:
Finally, to test whether the neuronal activity was explained by reward-related factors (REWARD), that is, cue salience and motivation to move when the effects of behavioral RTs were factored out, we performed a multiple regression analysis, fitting three models as follows:
Cue-related modulation in Stop trials
To study the contribution of the motivation to the movement inhibition, we first selected those task-related recordings with the neuronal activity involved in movement inhibition (countermanding [CMT] recordings). In the framework of the Stop-signal paradigm, we expect CMT recordings to be differently modulated when the movement is inhibited with respect to when the movement is made, and the level of activity during correct Stop trials to change early enough to be able to influence the cancelation of the movement (i.e., in response to the Stop signal presentation but before the end of the estimated SSRT). Against this background, we compared the neuronal activity on correct Stop trials with that on latency-matched Go trials. These are Go trials that have a level of movement preparation similar to that of the correct Stop trials (i.e., those trials in which RTs are longer than the specific SSD plus the SSRT). The analysis was performed for each SSD separately, putting together the different cue conditions and examining the time from the Go signal to 50 ms after the SSRT, in 20 ms time bins. We excluded the longest SSD (the fourth) from the analysis because of the few correct Stop trials. A recording was classified as CMT if the ANOVA was significant (α = 0.01 for at least two consecutive bins after the Stop signal presentation) in at least 2 of 3 SSDs (for a similar approach, see Scangos and Stuphorn, 2010; Mirabella et al., 2011).
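The selection of latency-matched Go trials and the CMT classification rule can be sketched as follows. The sketch assumes that the per-bin p values have already been computed by the statistical test; the function names, the dictionary layout, and the numbers are hypothetical.

```python
def latency_matched_go_rts(go_rts, ssd, ssrt):
    """Keep Go RTs longer than SSD + SSRT, i.e., trials in which the movement
    would have been cancelled had a Stop signal appeared at this SSD."""
    return [rt for rt in go_rts if rt > ssd + ssrt]

def has_consecutive_hits(p_values, alpha=0.01, n_consecutive=2):
    """True if at least n_consecutive successive bins have p < alpha."""
    run = 0
    for p in p_values:
        run = run + 1 if p < alpha else 0
        if run >= n_consecutive:
            return True
    return False

def is_cmt_recording(p_values_by_ssd, min_ssds=2):
    """Classify a recording as CMT if the criterion is met in >= min_ssds SSDs.

    p_values_by_ssd: dict SSD (ms) -> per-bin p values for the bins after
        the Stop signal (correct Stop vs latency-matched Go trials).
    """
    n_significant = sum(has_consecutive_hits(p) for p in p_values_by_ssd.values())
    return n_significant >= min_ssds

# Hypothetical values for one recording.
matched = latency_matched_go_rts([400, 520, 610, 700], ssd=370, ssrt=200)
cmt = is_cmt_recording({
    170: [0.2, 0.005, 0.004, 0.3],     # two consecutive significant bins
    370: [0.5, 0.009, 0.2, 0.008],     # significant bins are not consecutive
    570: [0.001, 0.002, 0.5, 0.5],     # two consecutive significant bins
})
print(matched, cmt)
```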
We then compared the neuronal activity on correct Stop and latency-matched Go trials separately for each cue condition by means of ROC analysis. For each CMT recording, we computed a measure of accuracy of the divergence of the neuronal activity (step 20 ms), starting 100 ms before the Stop signal to the end of SSRT. For recordings with significant activity difference (ROC value > 0.65 for at least three consecutive bins, same criteria used for the directional analysis above), we defined the onset of this difference as the first significant time bin. We refer to this time as the neuronal stopping time (NST).
We evaluated whether the NSTs differed significantly based on the cue conditions performing a one-way ANOVA (α = 0.05) between the latency distributions. NST estimations that occurred before or within 60 ms following the Stop signal presentation were excluded from this analysis. We also tested whether the proportions of CMT recordings obtained by means of the ROC in the three cue conditions were different through z score tests (α = 0.05) estimated separately for each pair of conditions.
Last, we examined in more depth the neuronal dynamics of recordings modulated after the Stop signal presentation. To this aim, we selected only the recordings whose activity decreased from 60 ms after the Stop signal. We also excluded those recordings (3 in Monkey 1 and 2 in Monkey 2) showing an increased activity in correct Stop trials compared with latency-matched Go trials. For each recording and each cue condition, we calculated the time from the Stop signal presentation at which the neuronal activity began to show a decreasing trend that went on for at least 100 consecutive ms. If this decreasing trend was not found, we excluded the corresponding recording from the following analyses. The time at which the neuronal activity began to show a decreasing trend was defined as the decrease onset time. We considered 200 ms starting from the decrease onset time, and we ran a robust regression to extract the slope of the neuronal suppression pattern. Last, we calculated the mean neuronal activity at the Stop signal presentation (from −40 to +40 ms relative to the Stop signal). To test the influence of the motivational context over these inhibitory measures, we performed three separate one-way ANOVAs (α = 0.05): one for the decrease onset times, one for the slopes, and the last one for the neuronal activity at the Stop signal.
Population decoding analysis
We performed a neuronal population decoding analysis using the maximum correlation coefficient classifier as previously described (Meyers, 2013). The classifier was trained to discriminate, among the different cue conditions organized on the basis either of the salience or of the motivation, and between the two possible correct responses of the Stop-signal task: movement generation in correct Go trials and movement inhibition in correct Stop trials.
We provided as input (data point) the MUA sampled at 20 ms intervals for each trial, from each task-related recording. For this population analysis, we combined the 80 task-related recordings from Monkey 1, and the 31 task-related recordings from one session of Monkey 2.
We defined the optimal split factor (k) as the highest number of trials available in each condition for each recording. In particular, the decoding analyses of the “motivation” (Go+ vs Neutral vs Stop+ conditions, putting together correct Go and correct Stop trials) and “cue salience” (Go+ & Stop+ vs Neutral conditions, putting together correct Go and correct Stop trials) factors were performed choosing a k of 164 (i.e., 164 trials × 3 conditions = 492 data points) and 219 (i.e., 219 trials × 2 conditions = 438 data points), respectively, in both cases without having to remove any recording. Similarly, for the decoding analysis of the trial-type factor (correct Go trials vs correct Stop trials), we used a k of 84 (i.e., 84 trials × 2 conditions = 168 data points), without having to remove any recording. The training of the classifier was then performed with pseudo-populations of recordings obtained by randomly arranging all the available data points into a number of splits equal to the number of data points per condition, and normalized into z scores to prevent recordings with higher levels of activity from dominating the decoding procedure. The classifier was trained using k – 1 splits and then tested on the remaining split: this procedure was repeated as many times as the number of splits. The overall decoding procedure was run 5 times with different selection of data in the training and test splits, and the decoding accuracy from these runs was then averaged.
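The train-on-k−1-splits, test-on-the-remaining-split scheme with a maximum correlation coefficient classifier can be sketched as follows. This is a simplified stand-alone illustration, not the toolbox implementation of Meyers (2013): class templates are the mean training vectors, a test vector is assigned to the class whose template correlates best with it, the z-score normalization and the repeated resampling runs are omitted for brevity, and all names and numbers are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length vectors (0.0 if undefined)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0

def class_templates(train):
    """train: list of (population_vector, label). Returns label -> mean vector."""
    grouped = {}
    for vec, label in train:
        grouped.setdefault(label, []).append(vec)
    return {lab: [sum(col) / len(col) for col in zip(*vecs)]
            for lab, vecs in grouped.items()}

def predict(templates, vec):
    """Pick the label whose template has the highest correlation with vec."""
    return max(templates, key=lambda lab: pearson(templates[lab], vec))

def cross_validated_accuracy(splits):
    """splits: list of k splits, each a list of (population_vector, label).
    Train on k - 1 splits, test on the held-out split, repeat for every split."""
    correct = total = 0
    for i, test_split in enumerate(splits):
        train = [pt for j, s in enumerate(splits) if j != i for pt in s]
        templates = class_templates(train)
        for vec, label in test_split:
            correct += predict(templates, vec) == label
            total += 1
    return correct / total

# Hypothetical pseudo-population vectors (4 recordings) at one time bin.
splits = [
    [([1.0, 2.1, 3.0, 4.2], "go"), ([4.1, 3.0, 2.2, 1.0], "stop")],
    [([0.9, 2.0, 3.1, 4.0], "go"), ([4.0, 3.2, 1.9, 1.1], "stop")],
    [([1.2, 1.9, 2.9, 4.1], "go"), ([3.9, 2.9, 2.0, 0.8], "stop")],
]
accuracy = cross_validated_accuracy(splits)
print(accuracy)
```

Running this once per 20 ms time bin would give a decoding accuracy time course; training at one bin and testing at another gives the cross-temporal variant.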
To assess whether the classification accuracy was above chance, a permutation test was performed by randomly shuffling the attribution of the conditions to the different trials, before rerunning the full clutter-decoding experiment. This procedure was repeated 5 times to obtain a null distribution of the decoding accuracies to be compared with the accuracy of the real decoding. The significance level was considered reached if the real decoding accuracies were greater than all the ones of the shuffle data in the null distribution for at least 5 consecutive significant bins.
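The significance criterion can be sketched as follows, under stated assumptions. The helper names (`null_accuracies`, `significant_bins`) and the `decode` callback are hypothetical: `decode` stands for any routine that maps a label vector to one accuracy per time bin. A bin counts as significant only if the real accuracy exceeds every shuffled accuracy in that bin and the bin belongs to a run of at least 5 such consecutive bins.

```python
import random


def null_accuracies(labels, decode, n_shuffles, rng):
    """Rerun a caller-supplied decoder on label-shuffled data to build the null."""
    null = []
    for _ in range(n_shuffles):
        shuffled = labels[:]
        rng.shuffle(shuffled)
        null.append(decode(shuffled))  # one list of per-bin accuracies per shuffle
    return null


def significant_bins(real_acc, null_acc, min_run=5):
    """Mark bins where real accuracy beats all shuffles for >= min_run bins in a row."""
    exceeds = [r > max(col) for r, col in zip(real_acc, zip(*null_acc))]
    mask = [False] * len(exceeds)
    start = None
    for i, flag in enumerate(exceeds + [False]):  # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                for j in range(start, i):
                    mask[j] = True
            start = None
    return mask


# Toy example with a stub decoder that always returns 0.55 in each of 10 bins.
rng = random.Random(0)
labels = ["Go"] * 4 + ["Stop"] * 4
null = null_accuracies(labels, lambda labs: [0.55] * 10, n_shuffles=5, rng=rng)
real = [0.5, 0.7, 0.7, 0.7, 0.7, 0.7, 0.5, 0.7, 0.7, 0.5]
mask = significant_bins(real, null)
print(mask)
```

In this toy run, only the first run of five above-null bins is marked; the later two-bin run is too short to meet the criterion.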
In the same dataset, we performed a cross-temporal decoding analysis by training the classifier at one point in time and testing its decoding performance at either the same or a different time point. This allowed us to investigate over time the dynamic (i.e., no correlation between different time points, corresponding to an evident diagonal band in the plot) or static (i.e., strong correlation for different time points, corresponding to evident square regions in the plot) nature of the population code (for further details, see Meyers, 2013) underlying the representation of “cue salience” and “motivation” in PMd.
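The distinction between static and dynamic codes can be made concrete with a small sketch. All names and data here are hypothetical, and the train/test partition is simplified to two independent toy sets rather than the cross-validation splits used in the analysis. Templates are built at one time bin and used to classify population vectors from every time bin, which yields a training-time by testing-time accuracy matrix.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length vectors (0.0 if degenerate)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy) if sxx > 0 and syy > 0 else 0.0


def templates_at(trials_by_label, t):
    """Mean population vector per label at time bin t."""
    return {lab: [sum(c) / len(trials) for c in zip(*(tr[t] for tr in trials))]
            for lab, trials in trials_by_label.items()}


def cross_temporal_accuracy(train, test, n_bins):
    """acc[i][j]: templates built at bin i, test vectors taken from bin j."""
    acc = [[0.0] * n_bins for _ in range(n_bins)]
    for i in range(n_bins):
        tmpl = templates_at(train, i)
        for j in range(n_bins):
            hits = total = 0
            for lab, trials in test.items():
                for tr in trials:
                    guess = max(tmpl, key=lambda c: pearson(tr[j], tmpl[c]))
                    hits += guess == lab
                    total += 1
            acc[i][j] = hits / total
    return acc


rng = random.Random(1)
A, B = [4, 1, 4, 1], [1, 4, 1, 4]  # two toy population patterns


def make_trials(patterns, n):
    """patterns: one mean vector per time bin; returns n noisy trials."""
    return [[[m + rng.gauss(0, 0.3) for m in p] for p in patterns]
            for _ in range(n)]


# Static code: each condition keeps the same pattern in both bins.
static = cross_temporal_accuracy(
    {"Go+": make_trials([A, A], 10), "Stop+": make_trials([B, B], 10)},
    {"Go+": make_trials([A, A], 10), "Stop+": make_trials([B, B], 10)}, 2)
# Dynamic code (extreme toy case): the patterns swap between bins.
dynamic = cross_temporal_accuracy(
    {"Go+": make_trials([A, B], 10), "Stop+": make_trials([B, A], 10)},
    {"Go+": make_trials([A, B], 10), "Stop+": make_trials([B, A], 10)}, 2)
print(static, dynamic)
```

The static toy case produces high accuracy everywhere in the matrix (a square region), whereas the dynamic case is accurate only on the diagonal. The swap used here is deliberately extreme and drives off-diagonal accuracy below chance; in real data a dynamic code would typically decay toward chance instead.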
Results
Monkeys strategically adjust their behavior according to the motivational context
Monkeys performed the task adapting their behavior to the three conditions (Go+, Stop+, and Neutral), each defining univocally a specific reward schedule as cued by the visual signal presented at the beginning of each trial. To evaluate the behavior, we considered monkeys' engagement in generating versus inhibiting a movement, by analyzing the RTs and the error rates (i.e., the probability to procrastinate the response; late response error trials) in Go trials, and the inhibitory performance in Stop trials (see Table 1).
Details of behavioral performance in the motivational Stop-signal task for the 3 monkeys
In Go trials, we found that the monkeys were faster and less prone to errors when a larger reward for Go trials was at stake (Go+ condition). Overall, RTs were shorter for Go+ compared with Neutral and, in turn, Stop+ condition (one-way ANOVA and post hoc test, all p values < 0.0001; Fig. 2A).
Behavioral performance according to the motivational context and the trial history. A, RTs on correct Go trials (mean ± SEM, n = 3 monkeys) for the three cue conditions: Go+, Neutral, and Stop+. B, Percentage of errors (error rates) on Go and Stop trials (mean ± SEM, n = 3 monkeys) for each cue condition. C, Estimated SSRTs for each cue condition (mean ± SEM, n = 12 sessions). D, Mean (±SEM) error rates (n = 14 sessions) in Go trials (leftmost panel) and in Stop trials (rightmost panel) following different trial types (Go or Stop trials). Test trials are sorted by cue condition. E, Differences between RTs (ΔRT ± SEM) of correct Go trials (last trial on each triplet; following correct Go trials, correct Stop trials, and wrong Stop trials) and the close in time Go trial (first trial on each triplet) (n = 14 sessions). The cue condition is for the last trial of the triplet. In all analyses, only the factor cue condition was significantly different: *p < 0.05.
Furthermore, monkeys committed fewer errors in Go trials on Go+ than on Neutral and, in turn, Stop+ condition (Fig. 2B). Examining each monkey separately, error rates were different in Monkeys 2 and 3 across all possible comparisons (z score test, all p values < 0.0001); whereas in Monkey 1, error rates in Stop+ were higher compared with both Neutral and Go+ conditions (p values < 0.0001), while no difference was found between these last two conditions (p = 0.503).
In Stop trials, we compared the error rates (i.e., probability to not withhold the response), and the latency of the stop process (SSRT; for details, see Materials and Methods) across motivational contexts. We computed the SSRT only if the independence assumption was validated at least in the Neutral condition, that is, when the amount of the expected reward was the same for correct Stop and correct Go trials (Monkey 1: 2 sessions; Monkey 2: 3 sessions; Monkey 3: 7 sessions). Indeed, when reward was differently manipulated for Go and Stop trials, as for Go+ and Stop+ conditions in this task, the independence could have been compromised as previously shown (Romo et al., 2004; Leotti and Wager, 2010; Matzke et al., 2018).
We found that, when stopping was more rewarded (Stop+), the inhibitory control of the monkeys was more effective. In particular, the monkeys committed ∼50% of errors in the Neutral condition, as expected from its similarity to the classical Stop-signal paradigm (see Materials and Methods), whereas fewer errors were committed in the Stop+ condition and more in the Go+ condition (Fig. 2B). The different inhibitory control for the three cue conditions was evident in each monkey across all possible comparisons (z score test, all p values < 0.001).
Moreover, the monkeys were faster in inhibiting their response when the stopping was more rewarded than the going; indeed, shorter SSRTs were observed in Stop+ condition than in Neutral and Go+ (Friedman test for repeated-measures, χ2 = 8.7917, p = 0.012; Fig. 2C).
Thus, the contextual information provided by the cue efficiently affected the strategy adopted by the monkeys, producing shifts in the trade-off between responding and stopping based on motivational context.
The strength of the context-related modulation was further confirmed by the analysis of recent history effects on Go trials and Stop trials. Indeed, while performing the Stop-signal task, subjects typically modulate their behavior based on the recent trial history (Emeric et al., 2007; Pouget et al., 2011; Marcos et al., 2013; Genovesio and Ferraina, 2014; Mione et al., 2015; Montanari et al., 2017; Olivito et al., 2017). In this experiment, the recently experienced trial types did not affect the behavior adopted by the monkeys as the motivational contexts defined by the cues did (Fig. 2D,E). In particular, the error rates in Go trials (Fig. 2D, leftmost panel) and in Stop trials (Fig. 2D, rightmost panel), following different trial types (Go or Stop trials), were affected by the cue condition (two-way ANOVA, cue factor: leftmost panel: F(2,78) = 12.84, p < 0.0001; rightmost panel: F(2,78) = 50.31, p < 0.0001) but not by the trial sequence (Trial sequence factor: leftmost panel: F(1,78) = 0.02, p = 0.8983; rightmost panel: F(1,78) = 0.28, p = 0.6004). Accordingly, RTs of correct Go trials (Fig. 2E), following different trial types (correct Go, correct Stop, or wrong Stop trials), were affected by the cue condition (cue factor: F(2,117) = 45.52, p < 0.0001) and not by the trial sequence (Trial sequence factor: F(2,117) = 0.39, p = 0.6775).
PMd neuronal activity signals first cue salience, then motivation to move
To inspect the neuronal correlates of the adjustment of behavior according to the motivational context, 124 recordings obtained from the PMd in 2 monkeys (80 from Monkey 1 and 44 from Monkey 2) were classified as task-related (see Materials and Methods) and used for the following analyses.
For the first level of analysis, we focused on the MUA neuronal activity observed during Go trials starting from the cue epoch (Fig. 3A). The fixed duration of this epoch (see Materials and Methods) allowed us to explore the evolution of the neuronal activity from the cue signal presentation to the first part of the RT, after the Go signal presentation (After-Go epoch).
Effect of motivational context on the neuronal dynamics during Go trials. A, Average MUAs (±SE) from two example recordings (leftmost panels) and at the population level (n = 124 recordings; rightmost panel) are represented for each cue condition. Shaded gray areas represent the three trial epochs investigated in the following analysis: Cue-Early slightly following the cue presentation, Cue-Late just before the Go signal, and After-Go following the Go signal. B, Percentage of recordings belonging to each category based on their neuronal pattern across cue conditions (one-way ANOVA, p < 0.05; n = 124 recordings). Percentages were calculated separately for the three epochs. Dashed gray line indicates not significant recordings. Solid gray line indicates recordings outside the previous categories. C, Average (±SE) Go+ versus Neutral and Stop+ versus Neutral CIs (n = 124 recordings) as a function of time during the trial. Positive values indicate that the neuronal activity is higher for the salient cue conditions (Go+, purple line; or Stop+, orange line) compared with the Neutral condition, and negative values indicate higher activity for the Neutral condition compared with the salient cue conditions (Go+ or Stop+). Horizontal black line with the asterisk above indicates the times (20 ms steps) during the trial in which there is a difference between Go+ versus Neutral CI and Stop+ versus Neutral CI (paired-sample t test, p < 0.05/86 bin after Bonferroni correction). D, Percentage of recordings with a dependency of neuronal activity on reward-related factors (Reward; white bar), behavioral RTs (black bar), or both (Reward & RT; gray bar) (n = 124 recordings). Percentages were calculated separately for Cue-Early and After-Go epochs.
We found that the neuronal activity reflected at different degrees the motivation to move or to stay, with recordings following the same pattern of behavioral RTs observed in the different contexts (for behavioral details of the sessions used, see Table 2). Particularly, the highest level of activity was found in Go+, in which shorter RTs were recorded, compared with Neutral and Stop+ conditions. However, early after the cue presentation, we also observed a neuronal modulation related to the salience of the cue (see Fig. 1B): in this case, higher level of neuronal activity was present in both Go+ and Stop+ conditions compared with the Neutral condition.
Details of behavioral performance in the sessions used for the neuronal analyses
Figure 3A shows two example recordings (leftmost panels) and the population activity (rightmost panel) in which the neuronal modulation, related to the cue and to the movement preparation, changed over time and across conditions. In most of the recordings (as in example 1 in Fig. 3A), the neuronal activity reflected the level of motivation to move, showing an ordered pattern: the level of activity was typically higher for the Go+ than for Neutral, and Stop+ conditions. This modulation typically emerged only in the Cue-Late epoch, and became fully expressed in the After-Go epoch. However, in some recordings (as for example 2 in Fig. 3A), the neuronal modulation started right after the cue presentation. Here the activity typically observed was higher for the Go+ and the Stop+ conditions than for the Neutral condition, suggesting that the salience of the cue and not the motivation to move was signaled. Afterward, also in this case, the modulation reflected more the level of motivation to move, becoming higher for the Go+ than for Neutral and Stop+ conditions, as in the other recordings.
The population-wide analysis of the MUA signal confirmed that, after the cue appearance (i.e., during the Cue-Early epoch), the activity in most task-related recordings was higher when salient cues were presented; that is, for Go+ and Stop+ compared with Neutral condition (Fig. 3B, Go+ & Stop+ > Neutral); this pattern weakened in the Cue-Late epoch, and in the After-Go epoch. Conversely, starting from the Cue-Late epoch, we observed higher levels of neuronal activity for Go+ than Neutral and Stop+ conditions (Fig. 3B, Go+ > Neutral & Stop+ and Go+ > Neutral > Stop+). The different neuronal modulation observed from the Cue-Early epoch to the After-Go epoch (χ2 test, χ2 = 93.57, p < 0.00001) showed that PMd neuronal activity changed over the trial: it was mostly dependent on the salience at the beginning, whereas, afterward, it was mostly dependent on the motivational context.
To explore the temporal evolution of the salience and motivational activities at a higher resolution, we calculated two CIs emphasizing the difference in neuronal activities both between Go+ and Neutral conditions, and between Stop+ and Neutral conditions (Fig. 3C). In agreement with what was reported above, we observed that, after the cue onset, Go+ and Stop+ conditions were very similar to each other (i.e., the two CIs were not significantly different). Only from 480 ms after the cue did a clear difference between the two CIs emerge, which was then maintained through the trial (paired-sample t test, see Materials and Methods).
Finally, we investigated whether the neuronal activity over the trials was better explained by reward-related factors only (salience in the Cue-Early epoch and motivation to move in the After-Go epoch), or it was better explained by movement preparation only, and so in association to behavioral RTs. To distinguish the motivation to move from movement preparation, we considered the first one as a coarse measure (see Materials and Methods), defined by three degrees: high in Go+, medium in Neutral, and low in Stop+. We also hypothesized a finer relationship between movement preparation and RTs: the higher the movement preparation, the lower the RT. A multiple least-squares regression approach (as in Roesch and Olson, 2003) revealed that, in the Cue-Early epoch, a model incorporating just reward-related factors (i.e., salience) significantly accounted for the neuronal activity in the great majority of the recordings (56%, 69 of 124, Fig. 3D) with respect to a full model incorporating both the RT and the reward-related factors (F test, p < 0.05). On the contrary, in the After-Go epoch, reward-related factors (motivation) and RT-related effects were strongly yoked (60%, 75 of 124; Fig. 3D), suggesting that the neuronal activity explained by motivation is strictly linked to movement preparation (behavioral RTs), and difficult to fully disentangle.
The overall pattern of activity modulation was also maintained when separating data for different movement directions (Fig. 4A). However, the cue conditions affected the onset of directional selectivity, suggesting an interaction between motivation and movement preparation (Fig. 4B,C). Indeed, discrimination times (i.e., the times in which the activity in each recording diverged between movement directions) obtained performing an ROC analysis were significantly different between cue conditions: shorter discrimination times were observed in Go+ than in Neutral and Stop+ (one-way ANOVA, Monkey 1: F(2,83) = 13.282, p = 0.00001; Monkey 2: F(2,52) = 6.2525, p = 0.00369).
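A minimal sketch of how a discrimination time can be derived from an ROC analysis is given below. The function names and toy data are hypothetical, the 20 ms bin width is an assumption, and the criterion is simplified to the first bin in which the area under the ROC curve (auROC) reaches the 0.65 threshold mentioned in the figure legend; the actual procedure is described in Materials and Methods and may include further requirements.

```python
def auroc(pos, neg):
    """Probability that a value drawn from pos exceeds one from neg (ties count 0.5)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def discrimination_time(pref, other, bin_ms=20, threshold=0.65):
    """pref/other: trials x bins activity for the two movement directions.

    Returns the time (ms from alignment) of the first bin whose auROC reaches
    the threshold, or None if the threshold is never reached.
    """
    n_bins = len(pref[0])
    for b in range(n_bins):
        a = auroc([tr[b] for tr in pref], [tr[b] for tr in other])
        if a >= threshold:
            return b * bin_ms
    return None


# Toy recording: activity for the two directions diverges from the third bin on.
left = [[1, 1, 3, 5], [2, 1, 4, 6], [1, 2, 3, 5]]
right = [[1, 2, 1, 1], [2, 1, 2, 2], [1, 1, 1, 2]]
t_disc = discrimination_time(left, right)
print(t_disc)  # 40
```

Comparing such per-recording times across Go+, Neutral, and Stop+ is the kind of input a one-way ANOVA like the one reported above would take.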
Effect of cue condition and response direction on the neuronal activity in Go trials. A, Average population MUAs (±SE) (n = 124 recordings) for each cue condition and for the response direction to left (Go to left; left panel) and to right (Go to right; right panel). Data are aligned to the cue and to the movement onset (MovON). B, Time evolution of accuracy (auROC) in discriminating between directions for each cue condition (Go+, Neutral, Stop+) and each monkey (Monkey 1: top; Monkey 2: bottom). For each recording, white dots represent the discrimination time, that is, the time at which the accuracy value reaches the threshold (ROC = 0.65). Data are sorted by discrimination times from the presentation of the Go signal. Histograms represent discrimination times for each cue condition together with the average discrimination time (arrows) and the number of recordings reaching the discrimination criteria. C, Average (auROC) value (±SE) across population aligned to the Go signal for each cue condition.
PMd neuronal activity signals differences between motivational contexts at the movement onset
We observed that, in some of the recordings (Fig. 5A, example 2), the neuronal activities were organized according to the motivational context up to the movement onset. In others (Fig. 5A, example 1), the neuronal activities, well separated in the After-Go epoch, overlapped for some hundreds of milliseconds and then separated again around the movement onset.
Effect of motivational context on the neuronal dynamics during the late phases of the movement preparation. A, Average MUAs (±SE) from two example recordings (the same as in Fig. 3A) are represented for each cue condition. Left, Data are aligned on Go signal (Go). Right, Data are aligned on movement onset (MovON). Shaded gray areas represent the two trial epochs investigated in the following analyses: before-MOV slightly before the movement onset and MOV around the movement onset. B, Percentage of recordings with a difference in the neuronal activity based on the cue condition in Go trials (go), in wrong Stop trials (wrong stop), and in a smaller sample of Go trials whose RTs did not differ across cue conditions (RT-matched). C, Average MUAs (±SE) at the population level (n = 124 recordings). Conventions are the same as in A.
We inspected the patterns of activity in relation to the movement onset by considering two epochs for the analysis: one from −200 to −100 ms before movement onset (before-Mov), which should represent the neuronal activity just before command signals are sent from the PMd to other structures to initiate muscle activation (Mattia et al., 2013; Kaufman et al., 2016), and another one, between −40 and 60 ms, around movement onset (Mov). We compared the levels of neuronal activity across the two epochs, and the three cue conditions. A two-way ANOVA showed that, at the population level, the neuronal activities did not differ across epochs or between cue conditions for either monkey (Monkey 1: cue condition: F(1,226), p = 0.48774; epochs: F(1,226), p = 0.55093; Monkey 2: cue condition: F(1,85), p = 0.25418; epochs: F(1,85), p = 0.08).
However, when looking with the same analysis at the single recordings, we found that some of them showed a significant modulation in the two epochs (Fig. 5B, black bars), as the examples of Figure 5A suggested. In the before-Mov epoch, 41% of recordings showed a difference between Go+ and Stop+, with 38% showing higher activity in Go+ and 3% in Stop+; in the Mov epoch, the number of modulated recordings increased to 58%, with almost all of them (57%) showing higher activity for Go+. Thus, when looking at the single recordings around the movement onset, the differences between conditions increased (z test, Z = −2.5, p = 0.007). We found the same pattern of increase in modulations in wrong Stop trials (Fig. 5B, dark gray bars), although the number of modulated recordings was slightly lower (from 6% to 35%; z test, Z = −5.1, p < 0.00001). We tested whether these results were confounded by the difference in RT across the conditions. To exclude this confound, we selected smaller samples of trials for each condition (n = 24 or 25 trials for Monkey 1, n = 22-28 trials for Monkey 2) for which the corresponding RT distributions were not different from each other (Kolmogorov–Smirnov test for each pair comparison, all p values > 0.14; difference Go+ vs Stop+: Monkey 1 = 41 ms; Monkey 2 = 33 ms; RTs median (interquartile range), Monkey 1 = 707 (109); Monkey 2 = 730 (213)). In this control analysis, we found that 7% of recordings showed a modulation in the before-Mov epoch. Importantly, even in this analysis, the number of modulated recordings increased to 30% in the Mov epoch (z = −4.1, p < 0.00001; Fig. 5B, dark gray bars).
These neuronal data show that a progressively higher number of recordings were modulated by the motivational context as they approached the movement onset. Thus, the motivational context was reflected in the neuronal activity even around movement onset, possibly providing vigor to the movement execution. This pattern was evident at the population level (Fig. 5C), even when splitting data for movement directions (Fig. 4A).
In PMd, the level of motivation to move/inhibit influences neuronal modulation during movement cancellation
We also investigated whether and how the motivational context influenced the neuronal activity during movement suppression. To this aim, we selected those recordings showing a role in the inhibition process; that is, their neuronal activity changed early enough in successful Stop trials to be able to influence the cancellation of the movement (i.e., after the Stop signal presentation but before the end of SSRT; for more details, see Materials and Methods). We found that 85% (106 of 124, of which 75 from Monkey 1 and 31 from Monkey 2) of task-related recordings were differently modulated when the movement was inhibited compared with when it was executed. These recordings were classified as countermanding or CMT recordings and used for the following analyses.
Figure 6A shows three examples of CMT recordings in which the neuronal activity in correct Stop trials depended on cue condition. For the most observed patterns (examples 1 and 2), when the Stop signal was presented, the level of neuronal activity was higher, and the divergence between correct Stop trials and corresponding Go trials (latency-matched; see Materials and Methods) occurred later in Go+ condition compared with Neutral and Stop+ conditions. Accordingly, in Stop+, the neuronal activity was still low when the Stop signal was presented, and the divergence between Go trials and correct Stop trials occurred much earlier compared with the other conditions. These patterns were in line with the behavioral findings (i.e., shorter SSRTs in Stop+ condition) compared with Neutral and, in turn, Go+ conditions. Example 3 shows another pattern, although rarely observed (3 recordings in Monkey 1 and 2 in Monkey 2). This less frequently observed activity represents overall the same relationship between motivation and inhibitory control of the largest group of CMT recordings; however, the pattern of discharge is inverted having an activity that is essentially decreasing after the Go signal and increasing following the Stop signal.
Different neuronal patterns for Go and Stop trials based on motivational context. A, Three example recordings differently modulated by the motivational context. For each example, left, middle, and right panels represent the neuronal activity for Go+, Neutral, and Stop+ conditions, respectively. Each panel represents the average MUA (±SE) of correct Stop trials (color lines) and the corresponding latency-matched Go trials (LM, black lines). B, For each cue condition (Go+: left; Neutral: middle; Stop+: right) and each monkey (Monkey 1: top; Monkey 2: bottom), the average population MUAs (± SE; n = 124 recordings; rightmost panels for each cue condition) of correct Stop trials (color lines) and latency-matched Go trials (LM, black lines) are presented. Color plots represent the time evolution of accuracy (auROC) in discriminating between correct Stop trials and latency-matched Go trials (leftmost panels for each cue condition). For each recording, white dots represent the NST (i.e., the time at which the accuracy value reaches 0.65). Red dots represent NSTs that were not considered (either because the latency was too short [<60 ms] or because they preceded the Stop signal). Data are sorted by NST from the presentation of Stop signal. Histograms represent the NSTs for each cue condition and each monkey, together with the average NST (arrows). NSTs between cue conditions were significantly different in both monkeys (one-way ANOVA; Monkey 1: F(2,176) = 15.812, p < 0.00001; Monkey 2: F(2,44) = 8.7740, p = 0.00062).
To obtain a population estimate of the movement inhibition, we performed a ROC analysis to identify, for each cue condition, the CMT recordings in which the activity diverged between Go (latency-matched) and Stop trials, and the time at which this divergence occurred (NST; see Materials and Methods). For the group of rarely observed recordings with an inverted pattern, we inverted ROC values (1 – ROC value) to include them with the others. Figure 6B shows that the average NST varied depending on the motivational context. We found that the average NST was longer in Go+ (mean ± SE: 167 ± 5 ms) than in Neutral (150 ± 5 ms) and, in turn, Stop+ (120 ± 6 ms) conditions (one-way ANOVA; F(2,223) = 19.128, p < 0.0001; for NSTs calculated from each monkey, see Fig. 6B).
We also found that the number of recordings showing the divergence varied depending on the context. In particular, 70% (74 of 106) of recordings revealed a divergent activity for the Go+ condition; this proportion significantly increased to 83% (88 of 106) for Neutral (z score test; Go+ vs Neutral: z = −2.2649, p = 0.02382), and reached 88% (93 of 106) for Stop+ (Go+ vs Stop+: z = −3.1912, p = 0.00142; Neutral vs Stop+: z = −0.9719, p = 0.33204). In the Stop+ condition, but not in Go+ and Neutral, the divergence between latency-matched Go and correct Stop trials was evident even before or slightly after the Stop signal presentation (as observed in example 2 of Fig. 6A and in the average activity in Fig. 6B): 27% (29 of 106) of the recordings distinguished between Go and Stop trials before or within 60 ms following the Stop signal (temporal limit used as indication of the supposed latency for the information of the visual Stop signal to reach the frontal lobe).
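As a worked check of the proportion comparisons above, the following sketch computes a pooled two-proportion z statistic from the reported counts. Whether the authors used exactly this variant of the test is an assumption, and the function name is hypothetical. The z values it yields agree with those reported in the text, and the two-sided p values from the normal approximation come out close to, though not necessarily identical with, the reported ones.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z test; returns (z, two-sided p value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, p_two_sided


# Counts of recordings showing a divergence, out of 106 CMT recordings.
z_go_neutral, p_go_neutral = two_proportion_z(74, 106, 88, 106)
z_go_stop, p_go_stop = two_proportion_z(74, 106, 93, 106)
z_neutral_stop, p_neutral_stop = two_proportion_z(88, 106, 93, 106)
print(round(z_go_neutral, 4), round(z_go_stop, 4), round(z_neutral_stop, 4))
```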
Overall, these findings suggest that the movement inhibition was potentially supported by two aspects: one proactive, related to the level of movement preparation reached when the Stop signal was presented, and determined by the cue condition; and the other, possibly related to the efficiency of the inhibitory signal, expressed as latency of neuronal modulation and as fraction of recordings participating in movement suppression.
Thus, the SSRT was longer in the Go+ condition either because of a higher level of movement preparation reached when the Stop signal was presented, or because of a weaker (slower) stop process, or because of a combination of the two. To investigate more in depth this aspect, we decided to compare across conditions three measures: the level of neuronal activity when the Stop signal occurred, the time at which the neuronal activity in correct Stop trials began to show a decreasing trend (decrease onset time), and, starting from this time, the slope that characterized the neuronal suppression pattern. We calculated these values only in recordings showing a neuronal suppression pattern after the Stop signal presentation (see Materials and Methods). In total, 66, 75, and 41 recordings were analyzed, respectively, for Go+, Neutral, and Stop+ conditions. We found that the motivational context affected all of the three measures.
The levels of neuronal activity at the Stop signal presentation were different between the three conditions (ANOVA: F(2,179) = 8.8378, p = 0.00022): neuronal activity was lower for Stop+, compared with the other conditions (Fig. 7A; Newman-Keuls post hoc test: p values < 0.01); whereas no difference was found between Neutral and Go+ (Newman-Keuls post hoc test: p = 0.217).
Neuronal modulation in Stop trials based on the motivational context. A-C, Average population MUA (±SE) at the Stop signal presentation (A), average decrease onset time (±SE) (B), and average slope (±SE) of the neuronal modulations (C), calculated for each cue condition (n = 66 recordings for Go+, 75 for Neutral, and 41 for Stop+). D, Neuronal modulations in Go and Stop trials for a single SSD (SSD2 = 370 ms) from Monkey 1. Left, middle, and right panels represent the neuronal activity for Go+, Neutral, and Stop+ conditions, respectively. Each panel represents the average MUA (±SE) from one example recording (top panels) and at the population level (n = 75 recordings; bottom panels) of correct Stop trials (color lines) and latency-matched Go trials (LM, black lines).
We also found clear differences in the neuronal dynamics after the Stop signal (ANOVA: F(2,179) = 4.5806, p = 0.01148): the decrease onset time (Fig. 7B) occurred earlier in the Stop+ condition than in the Neutral and Go+ conditions (Newman-Keuls post hoc test: p values < 0.05), which did not differ from each other (Newman-Keuls post hoc test: p = 0.518). Furthermore, the slopes changed depending on the cue condition (ANOVA: F(2,179) = 14.863, p < 0.00001), although in a different manner (Fig. 7C): the Neutral condition showed a steeper slope compared with Go+ and, in turn, Stop+ (Newman-Keuls post hoc test: all p values < 0.05). This was also evident when considering a single SSD across conditions (Fig. 7D).
Thus, in general, we found that the higher the motivation to suppress a movement, the lower the growth of activity and the earlier the modulation of neuronal activity started (after the Stop signal presentation). This dynamic was associated with a specific inhibitory modulation: in Stop+ trials, the slope was less steep because the lower level of movement preparation made a fast and strong inhibition unnecessary. Conversely, when a stronger inhibition was necessary because of the advanced movement preparation (as for Go+ and Neutral conditions), the motivation to inhibit a movement was translated into a stronger decrease of activity (i.e., steeper slopes for Neutral compared with Go+) after the Stop signal presentation.
These data support the idea that neuronal correlates of inhibition in PMd reflected both proactive aspects (i.e., level of movement preparation depending on the context) as well as aspects related to the instantiation of the inhibitory process started by the Stop signal.
Cue salience and motivation to move are encoded with different dynamics in PMd
In this experiment, we found that the presentation of cues univocally associated with specific reward schedules had effects on most of the channels: it drove first a response related to the cue salience (which schedule can furnish more reward), and afterward a response related to the specific motivation (i.e., better to be fast in the Go+ to gain more reward). However, it is not clear how the salience and motivation information were encoded at the population level.
To resolve this issue, we performed a neuronal decoding analysis by training a classifier to discriminate between the different conditions. We contrasted the salient cue conditions (Go+ and Stop+) with the Neutral one for “cue salience” and the three different cue conditions of the task for “motivation” (Go+ vs Neutral vs Stop+), including both Go and correct Stop trials. Indeed, in both trial types, information about salience and motivation must be represented in a similar way until the presentation of the Stop signal. This trial selection is supported by a decoding analysis based on the trial type (Go vs correct Stop), which could not distinguish between these trial types until 200 ms after the Go signal (Fig. 8A).
Classification accuracy over time and cross-temporal decoding plot for “trial type,” “cue salience,” and “motivation.” The decoding analysis was conducted to test separately the coding of either the trial type (correct Go trials vs correct Stop trials; A) or the cue salience (B) or the motivation (C) at the population level and to investigate the static/dynamic nature of the population coding. Superimposed white lines indicate the classification accuracy of the population (n = 111 recordings) over time. Red lines in the bottom part indicate the times at which classification accuracy is above chance level (50% for trial type and salience and 33% for motivation; permutation test, p < 0.05). White dashed lines indicate the time when the cue and the Go signal were presented. Cross-temporal decoding plot: the accuracy of the classifier was tested by training and testing it at different time points. A population coding is dynamic if the accuracy decays rapidly when testing the decoder at time points different from those used in the training phase (presence of a diagonal band in the plot); conversely, it is static if the accuracy is preserved across different time points (presence of square regions in the plot). The classification accuracy color scale on the top of each panel is for the cross-temporal decoding plot. Blue represents the threshold level on each plot.
Figure 8B, C shows that PMd population activity encoded (classification accuracy, white line; scale on the right side) the information about “cue salience” shortly after cue appearance (from 80 ms after the cue, Fig. 8B), and that this coding remained almost constant above chance level (threshold 50%) over time. Conversely, the encoding of motivation (Fig. 8C, threshold 33%), while starting at about the same time after cue onset, increased in strength over time.
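The comparison against chance level (50% for two classes, 33% for three) can be illustrated with a minimal sketch of a label-permutation test. Everything in it is an assumption made for illustration: the data are synthetic, the nearest-centroid classifier stands in for whichever classifier was actually used, and the names `nearest_centroid_accuracy` and `permutation_test` are hypothetical.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Fit one centroid per class on the training set and score the test set."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    # Squared Euclidean distance of every test trial to every class centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(classes[dists.argmin(axis=1)] == y_test))

def permutation_test(X_train, y_train, X_test, y_test, n_perm=500, seed=0):
    """Compare the observed accuracy with a null built by shuffling training labels."""
    rng = np.random.default_rng(seed)
    observed = nearest_centroid_accuracy(X_train, y_train, X_test, y_test)
    null = np.array([
        nearest_centroid_accuracy(X_train, rng.permutation(y_train), X_test, y_test)
        for _ in range(n_perm)
    ])
    # Proportion of shuffles that match or beat the observed accuracy.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p_value

# Synthetic three-class example (stand-in for Go+ vs Neutral vs Stop+; chance = 1/3).
rng = np.random.default_rng(1)
n_per_class, n_channels = 60, 20          # assumed sizes, not those of the study
class_means = rng.normal(scale=0.6, size=(3, n_channels))
labels = np.repeat(np.arange(3), n_per_class)
activity = class_means[labels] + rng.normal(size=(labels.size, n_channels))
train = np.arange(labels.size) % 2 == 0   # even trials train, odd trials test

observed, null, p_value = permutation_test(
    activity[train], labels[train], activity[~train], labels[~train]
)
print(f"accuracy={observed:.2f}, null mean={null.mean():.2f}, p={p_value:.3f}")
```

Shuffling only the training labels leaves the test set untouched, so the null distribution centers on the chance level of the problem (about one third here) and the observed accuracy can be judged against it.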
We confirmed that, at the population level, both salience and motivational signals were coded; as a new result, we found that salience and motivational contexts were coded with different dynamics.
To gain more insight into the nature of the dynamics, we performed a cross-temporal decoding analysis (Fig. 8; color plot on each panel; see Materials and Methods). Figure 8B shows that the static representation of the cue salience had two different phases (square patterns in the color plot) in the early and late part of the cue epoch, respectively. A dynamic code emerged only after the Go signal (as suggested by the diagonal band), purportedly related to the bell-shaped patterns of MUA before movement generation and to the difference of RTs between conditions. The motivation signal (Fig. 8C) displayed a different pattern: a static code emerged only in the late cue epoch, and it was mixed with a dynamic code that became evident during the RT. Indeed, around the Go signal the motivation should be translated into the proper level of movement preparation as driven by the cue. Thus, PMd activity was characterized by a salience signal that was roughly stable: it started after the cue and was maintained throughout the trial. By contrast, the motivation was represented as a growing signal that was translated into the proper level of movement preparation before the Go signal, consistent with what was observed at the single recording level. Importantly, for the second half of the cue period these two signals coexisted. This coexistence was further confirmed by the decoding analysis performed on correct Stop and Go trials, where no difference between the two trial types was detected until the Go signal (Fig. 8A).
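The distinction between static and dynamic codes in a cross-temporal decoding matrix can be sketched on synthetic data. This is only an illustration under stated assumptions: the simulated "recordings", the nearest-centroid decoder, and the function names `cross_temporal_accuracy` and `simulate` are hypothetical and do not reproduce the analysis pipeline of the study.

```python
import numpy as np

def cross_temporal_accuracy(data, labels, train_idx, test_idx):
    """data: (n_trials, n_channels, n_bins). Returns an (n_bins, n_bins) matrix,
    rows = bin used for training, columns = bin used for testing."""
    n_bins = data.shape[2]
    classes = np.unique(labels)
    acc = np.zeros((n_bins, n_bins))
    for t_train in range(n_bins):
        X = data[train_idx, :, t_train]
        y = labels[train_idx]
        centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
        for t_test in range(n_bins):
            Xt = data[test_idx, :, t_test]
            dists = ((Xt[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            pred = classes[dists.argmin(axis=1)]
            acc[t_train, t_test] = np.mean(pred == labels[test_idx])
    return acc

def simulate(code, n_trials=200, n_channels=30, n_bins=10, seed=0):
    """Two conditions whose population pattern is fixed ('static') or
    redrawn independently in every time bin ('dynamic')."""
    rng = np.random.default_rng(seed)
    labels = np.repeat([0, 1], n_trials // 2)
    if code == "static":
        pattern = rng.normal(size=(n_channels, 1)).repeat(n_bins, axis=1)
    else:
        pattern = rng.normal(size=(n_channels, n_bins))
    signal = np.where(labels[:, None, None] == 1, pattern, -pattern)
    noise = rng.normal(scale=2.0, size=(n_trials, n_channels, n_bins))
    return signal + noise, labels

def summarize(acc):
    """Mean accuracy on the diagonal and off the diagonal."""
    off_diag = ~np.eye(acc.shape[0], dtype=bool)
    return float(np.diag(acc).mean()), float(acc[off_diag].mean())

results = {}
for code in ("static", "dynamic"):
    data, labels = simulate(code)
    train_idx = np.arange(0, labels.size, 2)   # even trials train
    test_idx = np.arange(1, labels.size, 2)    # odd trials test
    results[code] = summarize(cross_temporal_accuracy(data, labels, train_idx, test_idx))
    print(code, "diagonal=%.2f off-diagonal=%.2f" % results[code])
```

In this toy setting, the static code keeps high accuracy off the diagonal (a square region), whereas the dynamic code generalizes poorly across time bins (a diagonal band).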
Discussion
We investigated how reward- and motivation-related information is reflected in PMd neuronal activity, and how this activity links with behavioral performance in a Stop-signal task during movement generation and inhibition.
Monkeys adapted their behavioral strategies according to reward information provided by the cue to maximize the probability of obtaining the higher reward amount. In particular, when they were more motivated to move because of the higher reward for Go trials, they responded faster in Go trials and committed more errors in Stop trials. Conversely, when stopping was prioritized, they were more accurate on Stop trials and responded more slowly in Go trials. This is in line with previous findings that reveal the effects of motivational bias on Stop-signal task performance (Leotti and Wager, 2010; Padmala and Pessoa, 2010). Indeed, motivational bias can affect the decision process about whether and when to initiate a response: the probability of responding and withholding will depend on the relative importance of the two goals. Other studies have shown that, in tasks requiring movement planning under risk, subjects are very good at choosing motor strategies maximizing expected gain when full information about the stimulus configuration and the assigned rewards is provided before movement onset (Trommershäuser et al., 2003a,b), that is, in conditions similar to those of our task.
We found that PMd neuronal activity after the cue was differently modulated for Go+ and Stop+ conditions compared with the Neutral condition. We hypothesized that the two cues associated with Go+ and Stop+ became salient cues for the monkeys because they were associated with contexts where maximum levels of reward could be earned, and therefore represented relevant information for the upcoming behavioral plan. As such, we considered the corresponding neuronal activity as reflecting the salience of cues, instead of other interrelated aspects of reward processing, such as the value of expected rewards or motivation.
The value, intended as the relative anticipated worth that some cue predicts (Bissonette and Roesch, 2016), is determined by the kind of reward (or penalty), its magnitude, and probability (Schultz, 2010). Based on this definition, in our paradigm, the value of the expected reward should be greater when monkeys expect higher reward for Go trials (Go trials representing the majority of trials presented to monkeys compared with Stop trials), with respect to the other conditions. This is different from our observation, in which the Stop+ condition, characterized by a low occurrence of Stop trials, and so by a lower probability of obtaining the higher reward, showed activity as high as that in the Go+ condition. Furthermore, previous studies have shown that the value of the expected reward is not represented by PMd neurons but is better explained by activity in prefrontal areas, such as OFC (Roesch and Olson, 2003, 2004).
It is more difficult to distinguish between salience and motivation at the neuronal level. Although the two are intertwined, it has been suggested that signals related to motivation should be observed in the period leading up to the behavioral response, whereas signals related to salience might be present solely during the presentation of cues (Bissonette and Roesch, 2016). This is exactly what we observed.
A previous study stated that PMd represents the motivation to act (Roesch and Olson, 2004). However, the experimental designs of this and following studies (Lin and Nicolelis, 2008; Matsumoto and Hikosaka, 2009; Leathers and Olson, 2012) manipulated appetitive and aversive stimuli to dissociate value signals from signals related to motivation and salience; they therefore cannot demonstrate a distinction between these last two factors, since they covary.
We created a task in which the neuronal activity explained by salient cues does not covary with the activity explained by the motivation to perform/suppress an action (for a similar approach, see Lin and Nicolelis, 2008). This allowed us to deduce that the early neuronal pattern of modulation is strictly related to the saliency of cues. However, it is possible that this PMd activity does not reflect salience per se, but some aspects related to general motor readiness useful to earn the highest level of rewards anticipated by the cue. Future studies will be necessary to unravel these factors.
This pattern of activity changed over time, becoming sensitive to the degree of motivation when monkeys were preparing their response after target presentation. The neuronal activity increased more when a large reward was expected for Go trials, compared with the Neutral and the Stop+ conditions, thus paralleling the RTs. Similar patterns have been observed, at the neuronal level, when investigating the neuronal correlates of speed-accuracy tradeoff in the saccadic system (Heitz and Schall, 2012).
Around movement onset, this modulation further increased. A similar observation has been reported for neurons in SMA (Scangos and Stuphorn, 2010). Possibly, this signal is available in PMd to provide vigor to the forthcoming movement or to be integrated in the online control of movements. However, in most of the recordings, the activity around movement onset was unaffected by the motivation. This confirms that PMd provides a heterogeneous contribution to the maturation of the motor plan (see also Mattia et al., 2013). Whether, in specific cases, a threshold-like mechanism is emerging for the generation of movements is a question that requires appropriately designed protocols.
In correct Stop trials, after the Stop signal, the neuronal activity decreased earlier when a higher reward was at stake for Stop trials, compared with the other conditions. These data suggest a “faster” and purportedly more efficient stop process when monkeys were more motivated to suppress actions, as also confirmed by the shorter SSRT. Support for our observation comes from EEG and imaging studies using the Stop-signal task, in which different rewards were indicated by the color of the Stop signal. These works found that reward affected response inhibition (Boehler et al., 2014; Schevernels et al., 2015). However, in our study, the motivational context was associated with specific preparatory states before Stop signal presentation. Thus, the differences in inhibition patterns between conditions are also related to proactive processes initiated by the cues at the beginning of the trials (Leotti and Wager, 2010; Greenhouse and Wessel, 2013). A schematic implementation of these ideas is represented in Figure 9.
Changes in the architecture of the hypothetical neuronal units processing the Go and the Stop signals in relation to the different contexts. The architecture is inspired by a model proposed by Schall's group for saccadic eye movement (Boucher et al., 2007). The Neutral context represents the typical dynamics hypothesized in the Stop-signal task. Go and Stop units are under the influence of external inputs (arrows) and under reciprocal inhibitory control (lines with ending dots). The thickness of the lines indicates the different strength of the connections in the three contexts and suggests a possible mechanism for the influence of motivation on neuronal modulation, in both Go and Stop trials. In the Go+ context, motivational factors induce an increase of the activity of the Go units and a reduction of the activity of the Stop units. This unbalanced relationship between inputs could potentially also affect the dynamics of reciprocal inhibition, further reducing the role of Stop units when the Stop signal is presented. In the Stop+ context, the reverse is observed. The changes in the hypothesized schematic model of interactive units could explain the patterns of modulations we observed in the different contexts. For example, in the Stop+ context, the low activity (as evident at the time of Stop signal presentation; see Fig. 7A) is associated with an early decrease in activity (see Fig. 7B) but with an almost flat slope (see Fig. 7C) when the Stop signal is successfully processed, because the level of activity is kept far from the one necessary to generate a movement by the weaker external inputs to the Go units and the increased inhibitory influence from Stop units. In Go+ and Neutral contexts, the modulation after the Stop signal presentation is observed roughly at the same time (see Fig. 7B). However, in the Neutral context, it is possible to observe a steeper slope than in the Go+ context (see Fig. 7C) in response to the Stop signal, probably because in Go+ the Stop units are less effective. The modulation observed, in particular during the Stop trials, suggests that the motivation to either move or inhibit could act by enhancing the influence of the specific units on the overall neuronal architecture subtending motor decisions.
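The qualitative idea of context-dependent input strengths acting on mutually inhibiting Go and Stop units can be sketched with a toy simulation. This is a minimal, deterministic illustration and not the model of Boucher et al. (2007) nor a model fitted to the data: the absence of noise and leakage, the Euler integration in 1 ms steps, and every parameter value and name (`Context`, `simulate_trial`, `stop_lead_time`) are assumptions chosen only to show the direction of the effects.

```python
from dataclasses import dataclass

@dataclass
class Context:
    go_input: float    # external drive to the Go unit (per ms), assumed value
    stop_input: float  # external drive to the Stop unit (per ms), assumed value

# Hypothetical input strengths: motivation scales the inputs in opposite directions.
CONTEXTS = {
    "Go+": Context(go_input=0.006, stop_input=0.003),
    "Neutral": Context(go_input=0.005, stop_input=0.004),
    "Stop+": Context(go_input=0.004, stop_input=0.005),
}

def simulate_trial(ctx, ssd=None, threshold=1.0, beta_stop=0.05,
                   beta_go=0.0005, t_max=400):
    """Return the time (ms after the Go signal) at which the Go unit reaches
    threshold, or None if the movement is cancelled. ssd=None is a Go trial."""
    go = stop = 0.0
    for t in range(1, t_max + 1):
        stop_drive = ctx.stop_input if (ssd is not None and t > ssd) else 0.0
        d_go = ctx.go_input - beta_stop * stop   # Stop unit inhibits Go unit
        d_stop = stop_drive - beta_go * go       # Go unit weakly inhibits Stop unit
        go = max(0.0, go + d_go)                 # activities cannot go negative
        stop = max(0.0, stop + d_stop)
        if go >= threshold:
            return t
    return None

def stop_lead_time(ctx):
    """RT on Go trials minus the latest Stop-signal delay that still cancels:
    a crude, SSRT-like index of how long the toy stop process needs."""
    rt = simulate_trial(ctx)
    cancellable = [ssd for ssd in range(rt) if simulate_trial(ctx, ssd=ssd) is None]
    return rt - max(cancellable)

go_rt = {name: simulate_trial(ctx) for name, ctx in CONTEXTS.items()}
stop_outcome = {name: simulate_trial(ctx, ssd=160) for name, ctx in CONTEXTS.items()}
lead = {name: stop_lead_time(ctx) for name, ctx in CONTEXTS.items()}

for name in CONTEXTS:
    outcome = "cancelled" if stop_outcome[name] is None else "not cancelled"
    print(f"{name}: Go RT={go_rt[name]} ms, SSD=160 ms -> {outcome}, "
          f"stop lead time={lead[name]} ms")
```

With these assumed values, the toy units respond fastest in the Go+ context and slowest in the Stop+ context, and a Stop signal at the same delay is cancelled in the Neutral and Stop+ contexts but not in the Go+ context, in the same direction as the behavioral pattern described in the text.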
We showed a direct relationship between level of MUA and motivation to move or movement preparation. After the Stop signal, the activity decreased. However, we also found MUAs showing an opposite pattern (i.e., an increase of activity following the Stop signal). The number of recordings with these modulations was small, and potentially related to the layer of recordings accessible by the Utah electrode arrays. Indeed, decreasing patterns are mostly observed in deeper layers in PMd (Chandrasekaran et al., 2017).
Together, our results may be interpreted in the framework of decision-theoretic models (for review, see Andersen and Cui, 2009; Cisek and Kalaska, 2010; Cisek, 2012). When animals can choose between different potential actions, the variation in the expected reward exerts a correlated influence on both the choice behavior and the neuronal activity in some motor-related areas (Platt and Glimcher, 1999; Sugrue et al., 2004; Pastor-Bernier and Cisek, 2011). The activity in these regions reflects both action selection and movement preparation (Shadlen and Newsome, 2001; Romo et al., 2004; Cisek and Kalaska, 2005; Mione et al., 2020). In this framework, Pastor-Bernier and Cisek (2011) conducted a study in which a monkey was trained to choose between two targets associated with different reward values. The results showed that PMd activity was modulated by the relative value of potential targets, thus suggesting that decisions between actions are determined by a competition between action representations which takes place within sensorimotor circuits. Nevertheless, this idea has mainly been tested for spatial target selection. Our experimental design does not allow us to explicitly discriminate between activity patterns associated with spatial coordinates that correlate with different potential actions. Nonetheless, we can suppose that PMd may first represent decision-related variables, such as the level of evidence in favor of a given behavioral choice: after the cue, PMd was more activated when the monkeys preferred a behavioral strategy (move or withhold the movement), depending on the expected reward amount; whereas, later in the trial, the activity reliably represented the monkey's response choice.
Finally, we must remember that our results are based on the analysis of the modulation of a mesoscopic signal, attributable to the average firing rate of a population of uncertain size near the electrode tip. Although similar approaches have been used in the past (Supèr and Roelfsema, 2005; Stark and Abeles, 2007; Drebitz et al., 2019; Ahmadi et al., 2021), it is difficult to estimate how much of this modulation is visible at the level of single neurons. In our experience, what is evident at the mesoscopic level is generally also visible in neurons isolated from the same electrode, when isolation is possible given the signal-to-noise ratio. However, MUA and SUA are definitely different signals. Depending on the goal of the analysis, the MUA could be a better decoder of the underlying dynamics (Ahmadi et al., 2021).
Footnotes
This work was supported in part by European Union Grant BRAINLEAP GA 306502 and Sapienza grant CIDES PH11715C823A9528.
The authors declare no competing financial interests.
Correspondence should be addressed to Stefano Ferraina at stefano.ferraina@uniroma1.it or Pierpaolo Pani at pierpaolo.pani@uniroma1.it