Abstract
Feedback monitoring and adaptation of performance involve a medial reward system including medial frontal cortical areas, the medial striatum, and the dopaminergic system. A considerable amount of data has been obtained on frontal surface feedback-related potentials (FRPs) in humans and on the correlate of outcome monitoring with single unit activity in monkeys. However, work is needed to bridge knowledge obtained in the two species. The present work describes FRPs in monkeys, using chronic recordings, during a trial and error task. We show that frontal FRPs are differentially sensitive to successes and failures and can be observed over long-term periods. In addition, using the dopamine antagonist haloperidol we observe a selective effect on FRP amplitude that is absent for pure sensory-related potentials. These results describe frontal dopaminergic-dependent FRPs in monkeys and corroborate a human-monkey homology for performance monitoring signals.
Introduction
Performance monitoring, i.e., the continuous checking of goal achievement (Ullsperger, 2006), lies at the heart of adaptation by inducing the regulation of cognitive control, emotional responses, and motivational adjustments. Experiments in humans have revealed a neural signature of performance monitoring: a frontal medial evoked potential [error-related negativity (ERN)] observed during incorrect motor performance (Falkenstein et al., 1991; Gehring et al., 1993). A frontal negative signal has also been observed in relation to external feedback of performance [feedback error-related negativity (fERN)]. The two signals, ERN and fERN, might reflect the same underlying mechanism (for review see Holroyd and Coles, 2002). These signals have since been correlated with subsequent behavioral adaptation suggesting a role in adjusting performance and in reinforcement learning (Gehring et al., 1993; Debener et al., 2005; Frank et al., 2005; Cohen et al., 2007; Taylor et al., 2007). The brain potential is sensitive to pharmacological challenges in particular when aminergic transmission is concerned. Haloperidol (a dopaminergic antagonist) reduces the amplitude of ERN although it induces mixed or even no behavioral effects (Zirnheld et al., 2004; de Bruijn et al., 2006). This supports the idea of functional relationships between dopamine, prediction error signals, and the ERN and fERN (Holroyd and Coles, 2002). Importantly, the negativity appears abnormal in a wide range of neurological and psychiatric disorders. This, in itself, is of considerable interest and opens the door to preclinical or clinical studies using inexpensive noninvasive functional evaluations of adaptive cognitive processes.
The ERN and fERN might be used to indicate the integrity of a whole system rather than to directly measure local processing of errors (Ullsperger, 2006). The neural origin of ERN and fERN indeed does encompass several brain structures devoted to reinforcement learning and cognitive control. The anterior cingulate cortex (ACC), the striatum, the orbitofrontal cortex, the lateral prefrontal cortex, the supplementary eye field, and the aminergic systems are directly or indirectly involved. The ACC is of particular interest since its activation increases in concert with the production of the ERN and because source reconstructions often point to the ACC (Dehaene et al., 1994; Debener et al., 2005). An influential hypothesis posits that the error-related negativity is generated from ACC activity changes when the consequences of an action are worse than expected (Holroyd and Coles, 2002). Holroyd and Coles referred to Schultz's (2000) work in monkeys, showing that dopaminergic neurons increase and decrease their activity for respectively positive and negative reward prediction errors. They proposed that through the direct mesocortical dopaminergic pathway, a dopamine-mediated negative reward prediction error signal disinhibits ACC neurons, which thereby produce the cortical error signal. A recent extension posits that conversely a positive prediction error should inhibit ACC feedback-related activity and thus reduce surface feedback-related potentials (Holroyd, 2004; Holroyd et al., 2008).
Before the discovery of a human ERN, Brooks (1986) had observed a local field potential evoked by incorrect motor performance and recorded in the vicinity of the monkey ACC. Later, local recordings in the banks of anterior cingulate sulcus revealed increased activity after errors, reduced rewards, and the absence of expected rewards (Ito et al., 2003; Amiez et al., 2005; Emeric et al., 2008; Quilodran et al., 2008). Although error-related activity has been reported in other areas (e.g., supplementary eye field, lateral prefrontal cortex, and orbitofrontal cortex), the incidence of outcome-related activity is particularly high in the ACC. Yet, the homology between human and monkey ACC regarding performance monitoring has been questioned because of recurring contradictions between data obtained in monkeys and humans (Botvinick et al., 2004).
In this context, we used chronic frontal transcranial recordings in behaving monkeys to address three issues. First, we show a frontal medial surface potential related to performance feedback (FRP) and modulated during cognitive tasks. Second, we show that the monkey feedback-related potential is sensitive to dopaminergic transmission. Third, we confirm with this model that long-term FRP follow-up can be performed for longitudinal investigations.
Materials and Methods
Housing, surgical, electrophysiological, and histological procedures were performed according to the European Community Council Directive (1986) (Ministère de l'Agriculture et de la Forêt, Commission nationale de l'expérimentation animale) and Direction Départementale des Services Vétérinaires (Lyon, France).
Subjects.
Two 14-year-old rhesus monkeys (Macaca mulatta; monkeys S and R) served as subjects in this study, one male and one female weighing 8 kg and 7 kg, respectively. During sessions, the animal was seated in a primate chair (Crist Instrument) within arm's reach of a tangent touch-screen coupled to a TV monitor (Microtouch System). In the front panel of the chair, an arm-projection window was opened, allowing the monkey to touch the screen with one hand. A computer recorded the position and accuracy of each touch. It also controlled the presentation via the monitor of visual stimuli (color shapes), which served as light-targets (CORTEX software, National Institute of Mental Health Laboratory of Neuropsychology, Bethesda, MD). Eye movements were monitored using an Iscan infrared system (Iscan). Four target items (disks of 5 mm in diameter) were used: upper left (UL), upper right (UR), lower right (LR), and lower left (LL) (Fig. 1A). A central white square served as fixation point (FP). The lever was disposed just below the FP. Reward (fruit juice) was delivered via a reward-delivery-system (Crist Instrument). All neurophysiological recordings were performed with an Alpha-Omega multichannel system (AlphaLab, AlphaOmega). Analyses of neurophysiological signals were made using NeuroExplorer, Elan-Pack (Inserm U821, Lyon, France), and MatLab homemade scripts.
Behavioral task.
Monkeys were trained in a problem solving task (PST) (Procyk and Goldman-Rakic, 2006); they had to find by trial and error which target, presented in a set of four, was rewarded (Fig. 1A,B). Each trial started with the onset of a starting target named “lever.” The animal had to initiate trials by touching the lever and maintaining the touch. A FP appeared and the animal had to fixate it. A delay period (2 s) followed, and ended with the simultaneous onset of the four gray targets and offset of FP. At the FP offset, the animal was required to make a saccade toward one target, fixate it (0.5 s), and then touch it following a GO signal (all targets turned white). The time delay from target fixation to GO signal was fixed, thus leading to anticipatory patterns of reaction times. If the choice was incorrect (no reward, negative feedback), the monkey could select another target in the following trial and so on until the solution was discovered (end of search period). The animal was then allowed to repeat the correct choice for at least 3 trials (3 trials in 90% of cases; 7 or 11 trials in 10% of cases). Each block of trials (or problem) thus contained a search period and a repetition period. A visual signal [signal to change (SC)] at the end of the repetition period indicated the beginning of a new problem. The new correct target was selected such that it was different from the previous one in ∼90% of cases. Any break in fixation requirements resulted in trial cessation (break fixation error).
Experimental schedule.
One important aspect of the experiment was to verify the existence of FRPs and their sensitivity to various behavioral parameters. In the present paper we focus on data from two phases (0 and I) for behavior and on phase I for event-related potentials. The time line of the protocol is presented in Figure 1C. In the initial phase (phase 0) of testing all targets turned gray at the touch, and after a 0.4 s delay all targets switched off and the outcome was given. A reward (fruit juice) was delivered for choosing the correct target (positive feedback). If the choice was incorrect no reward was given (negative feedback). In the subsequent phase of testing, visual feedbacks were used to dissociate in time visual performance feedbacks from liquid rewards, and to test the effect of presenting new feedbacks on feedback-related potentials. In phase I, 400 ms after the touch, targets turned red (negative feedback) or green (positive feedback) for incorrect and correct responses, respectively. This visual feedback was displayed for 500 ms and followed (when correct) by a reward delivery. The SC was changed to be identical to the negative feedback (Fig. 1B). In the days following phase I, we tested (1) the effect of changing visual feedbacks: yellow stars replaced red disks for negative feedback and the SC was changed accordingly (only data for negative feedback change are reported), (2) the reliable presence of feedback discrimination after 7 months (long-term test), and (3) the effect of acute systemic haloperidol (Haldol) injections (Fig. 1C).
Phases are composed of sessions (one session corresponding to one day of recording). However, for the purpose of event-related potential (ERP) analyses phases were subdivided in steps, each step being composed of several sessions averaged to obtain an equivalent total number of trials and with at least 50 events by trial type (see below). Phase 0 was composed of the 11 sessions preceding the insertion of visual feedback of performance and that met the requirements for number of trials. These 11 sessions were grouped in 2 successive steps (steps 1 and 2). Phase I was subdivided in 3 steps (steps 3 to 5). Data from one step before (step 6) and two steps after (steps 7 to 8) negative feedback change were also studied. Phase I covered 3 months of recordings for both monkeys. One week separated the last session of phase I from the first session of step 6 for both monkeys. Tests over steps 6–8 covered 3 months of recordings for both monkeys. The first session of phase I was separated from the first session of long-term test by 7 months in both monkeys.
Surgical procedures.
Surgical procedures were performed under aseptic conditions. Animals were implanted with a head-holder and intracranial electrodes. Following premedication with atropine (1.25 mg, i.m.), dexamethasone (4 mg, i.m.), and chlorpromazine (Largactil, 1 mg/kg, i.m.), anesthesia was induced with ketamine hydrochloride (20 mg/kg, i.m.). Anesthesia was maintained with halothane in N2O/O2 (70/30). Heart rate was monitored and artificial respiration adjusted to maintain the end-tidal CO2 at 4.5–6%. A bar was attached to the skull with small stainless steel screws and then embedded in an acrylic assembly to permit subsequent head fixation. Using stereotaxic guidance, 15 stainless steel surgical screws (Synthes) were fixed in the skull and connected to a standard female D25-pins connector. The ensemble was then anchored with dental acrylic to the head-holder. The screws served as transcranial electrodes that were expected to touch the dura. The 14 active electrodes, implanted 5 mm apart from each other, covered a surface area of 175 mm2 over the anterior midline (Fig. 1D). One electrode serving as reference was screwed on the midline anterior to the set of the 14 active electrodes. The most posterior electrodes were placed at anterior level +25. To keep the connector free of debris, a male connector was placed and fixed into the implanted connector when the animal was returned to the home cage. After surgery, monkeys were kept under observation; to prevent pain, morphine was administered after the anesthesia began to wear off; antibiotics were given before surgery and continued for 6 d.
Drug testing.
We used an antagonist of essentially D2–D4 dopamine receptors: haloperidol (Haldol, Janssen-Cilag, 5 mg/ml for injection). Drug doses used in testing sessions were defined based on tests for side effects. Doses were reduced until no global behavioral effects [drowsiness and extra-pyramidal effects: akathisia, dystonia, akinesia, tremor, tardive dyskinesia, due to the action of the drug on the extra-pyramidal system (Coffin et al., 1989)] could be observed in the home cage. We tested doses of 0.02, 0.01, and 0.005 mg/kg (i.m.) and selected 0.01 mg/kg for the final recording sessions.
The effect of haloperidol was tested in 2 sessions for monkey S and 4 sessions for monkey R. We tested haloperidol in a minimal number of sessions to avoid effects of repeated challenges. Indeed, chronic exposure to haloperidol upregulates D2-dopaminergic receptors and downregulates D1 receptors, inducing cognitive dysfunctions (Lidow and Goldman-Rakic, 1994; Castner et al., 2000; Silvestri et al., 2000). In our protocol, a minimum washout period of 10 d was observed between two consecutive drug challenges. Behavioral performances and EEG activities were compared to sessions preceding drug treatment. To evaluate the effect of time, data acquired 30 min or 3 h after injection were analyzed separately. Testing was performed 11 months after the beginning of phase I for both monkeys.
Behavioral data analysis.
Reaction times (RT) of arm movements from lever to target were computed on each trial. Latencies of saccades from FP to peripheral targets were measured using automatic detection of deviation of the horizontal component of saccades using a Matlab homemade script. Other parameters were controlled daily to evaluate performance as follows: n1 and n2, mean number of trials to find the correct target and to complete the repetition period, respectively; M (Motivation), number of trials initiated by the animal (that is during which the monkey at least touched the lever) over the total number of trials presented, considered as reflecting the motivational state of the animal; P (Perseverance), number of times an incorrect choice is immediately repeated in the search period, which we named perseverance (expressed in average number per problem solved); Shift, the average number of shifts away from the correct response per repetition period.
Parameters n1 and n2 were used during initial training sessions and compared to optimal values to increase task difficulty (number of targets). Optimal performances were calculated for an ideal situation where errors in search are not repeated and the correct response is repeated without errors (Procyk and Goldman-Rakic, 2006).
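As a rough illustration of what an optimal value for n1 looks like, the sketch below assumes that the rewarded target is drawn uniformly among the targets and that an ideal searcher never repeats an incorrect choice, so the search ends on trial 1, 2, ..., N with equal probability. The uniform-draw assumption and the function name are hypothetical and are not taken from the cited procedure; in the actual task the new target differed from the previous one in ∼90% of cases, which an ideal searcher could exploit.

```python
from fractions import Fraction

def optimal_search_length(n_targets: int) -> Fraction:
    """Expected number of trials needed to find the rewarded target.

    Hypothetical assumptions: the rewarded target is uniformly random and
    no incorrect choice is ever repeated, so the correct choice is equally
    likely to occur on trial 1..n_targets (the correct trial is counted).
    """
    return Fraction(sum(range(1, n_targets + 1)), n_targets)

# With four targets: (1 + 2 + 3 + 4) / 4
print(float(optimal_search_length(4)))  # 2.5
```

Under these assumptions a measured n1 close to 2.5 would indicate a search without repeated errors, whereas larger values would indicate repeated incorrect choices.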
Trial types were identified according to their position in the search and repetition periods. Only incorrect trials (INC) in search periods are considered. Correct trials were grouped as correct trials from the search (CO1) and from the repetition (COR, i.e., second, third and fourth correct trials).
Electrophysiological recordings.
Two weeks after surgery, electrophysiological recordings were initiated. All electrodes were referenced to the most frontal electrode (Fig. 1D). The signal from each electrode was amplified and filtered (1–250 Hz), and digitized at 0.8 kHz. ERPs were analyzed off-line (NeuroExplorer software, and Matlab home-made scripts). We analyzed ERP peak latencies and amplitudes for target onset and FRPs. ERPs were averaged for each session and the mean amplitude of the 200 ms period before the onset of a particular event was subtracted from the averaged ERPs for baseline correction. Sessions with <10 problems solved were excluded from analysis, and averaged ERPs embodied at least 260 events by trial type and by step. For the description of FRPs' components we based our measures on grand average waveforms covering the entire phase I during which no change of feedback occurred. Peak amplitudes and latencies were measured by detecting maximum or minimum average amplitude within selected time windows. Windows were defined based on the observation of overall ERP shapes (see Results).
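The baseline correction described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MatLab code; the function name, the toy waveform, and the 2 µV offset are assumptions introduced for the example. Only the 200 ms pre-event window and the 0.8 kHz sampling rate (1.25 ms per sample) come from the text.

```python
def baseline_correct(erp, times_ms, baseline_ms=200.0):
    """Subtract the mean of the pre-event window from an averaged ERP.

    erp      -- amplitudes (e.g., in microvolts), one value per sample
    times_ms -- sample times relative to event onset (0 = event)
    """
    pre = [v for v, t in zip(erp, times_ms) if -baseline_ms <= t < 0]
    if not pre:
        raise ValueError("no samples in the baseline window")
    offset = sum(pre) / len(pre)
    return [v - offset for v in erp]

# Toy example: 1.25 ms sampling step, constant 2 µV offset,
# and a 5 µV plateau between 150 and 250 ms after the event.
times = [i * 1.25 - 200.0 for i in range(400)]  # -200 ms to +298.75 ms
erp = [2.0 + (5.0 if 150 <= t <= 250 else 0.0) for t in times]
corrected = baseline_correct(erp, times)
```

After correction the pre-event samples sit at zero and the plateau keeps its 5 µV height, so amplitudes are expressed relative to the pre-event level.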
To detect the latency of the difference in average ERPs between negative and positive feedbacks, we performed an ANOVA [time bins × feedback (INC, COR)] using a Helmert contrast on time bins. ERP measures were computed on successive 20 ms time bins. The Helmert contrast compares the second level with the first, the third with the average of the first two, and so on, and thus enables detection of the first time bin showing a significant difference between two conditions.
Peaks of difference-waves were also analyzed; this measure, commonly used in human studies to isolate components of interest, was applied to contrast ERPs for negative and positive feedbacks. In addition, this measure allowed us to address the important issue of varying degrees of difference between negative and positive feedback-related potentials (Holroyd et al., 2008). As described in the Results section, our analyses focused on the 0.15–0.25 s window. The average of the two electrodes presenting the greatest amplitude at the first positive peak on INC, CO1, COR, and on INC-COR difference maps was used for FRP analyses (electrodes E5 and E12) (Fig. 1D). For haloperidol testing, we compared ERPs and difference waves obtained during haloperidol sessions to those recorded during earlier control sessions. Differences were tested by calculating prediction intervals from control sessions, and by performing permutation tests on control and test sessions (p values were estimated from 10,000 permutations; see supplemental notes for details, available at www.jneurosci.org as supplemental material). MatLab, Statistica (StatSoft), and R v2.5.0 (R Foundation for Statistical Computing) were used for analyses and graphics. The alpha level was set at 0.05 for all analyses.
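To make the contrast scheme concrete, the sketch below builds the contrast weights in the order described above (each level against the mean of all preceding levels) and applies them to a hypothetical series of INC-minus-COR amplitudes. It illustrates only the weights, not the significance testing of the ANOVA; the function names and the toy values are assumptions. The weights are scaled so that each contrast reads directly as "this bin minus the mean of the earlier bins"; statistical packages often use an integer-scaled version of the same contrasts.

```python
def helmert_contrasts(n_levels):
    """Rows of contrast weights: level k+1 minus the mean of levels 1..k."""
    rows = []
    for k in range(1, n_levels):
        rows.append([-1.0 / k] * k + [1.0] + [0.0] * (n_levels - k - 1))
    return rows

def apply_contrasts(bin_means):
    """Value of each contrast for a list of per-bin mean amplitudes."""
    return [sum(w * m for w, m in zip(row, bin_means))
            for row in helmert_contrasts(len(bin_means))]

# Hypothetical INC-minus-COR amplitudes (µV) in five successive 20 ms bins:
# the conditions start to diverge in the fourth bin.
diff_by_bin = [0.0, 0.0, 0.0, 4.0, 4.0]
contrast_values = apply_contrasts(diff_by_bin)
print(contrast_values)  # [0.0, 0.0, 4.0, 3.0]
```

In this toy series the first non-zero contrast is the third one (fourth bin versus the mean of the first three), which is how the scheme localizes the onset of a difference in time.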
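A generic two-sample permutation test of the kind mentioned above can be sketched as follows. This is not the authors' procedure, which is detailed in their supplemental notes; the test statistic (absolute difference of means), the add-one correction, the function name, and the toy amplitudes are all assumptions made for illustration. Only the figure of 10,000 permutations comes from the text.

```python
import random

def permutation_test(control, test, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    control, test -- per-session measures (e.g., difference-wave peak amplitudes)
    Returns an estimated p value.
    """
    rng = random.Random(seed)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = abs(mean(test) - mean(control))
    pooled = list(control) + list(test)
    n_test = len(test)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabeling of sessions
        perm_diff = abs(mean(pooled[:n_test]) - mean(pooled[n_test:]))
        if perm_diff >= observed - 1e-12:  # tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_perm + 1)

# Hypothetical peak amplitudes (µV): 8 control sessions, 2 drug sessions.
control_peaks = [9.1, 8.7, 9.5, 8.9, 9.3, 9.0, 8.8, 9.2]
drug_peaks = [6.0, 6.4]
p_value = permutation_test(control_peaks, drug_peaks)
```

With so few test sessions the attainable p values are coarse: in this toy case only 45 distinct relabelings exist, so the smallest achievable p is about 1/45, however many random permutations are drawn.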
Reconstructions of surface maps of potentials were performed with the software package for electrophysiological analysis (ELAN-Pack) developed at the Inserm U821 laboratory (previously U280; Lyon, France; http://u821.lyon.inserm.fr/). Each electrode was given spherical coordinates on a unit sphere. To visualize potentials distribution, values were interpolated with spherical spline functions (Perrin et al., 1987, 1989). To visually compare maps between subjects, data were normalized with the average reference as for human high-density recordings (Handy, 2005).
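The average-reference normalization used for the maps amounts to subtracting, at each time sample, the mean across electrodes from every electrode. A minimal sketch, with a hypothetical function name and toy values, is given below; it is independent of the ELAN-Pack implementation and does not cover the spherical spline interpolation.

```python
def average_reference(samples_by_channel):
    """Re-reference one time sample: subtract the mean across electrodes."""
    mean = sum(samples_by_channel) / len(samples_by_channel)
    return [v - mean for v in samples_by_channel]

# Hypothetical amplitudes (µV) at one time point on 14 active electrodes.
raw = [3.0, 4.5, 6.0, 7.5, 9.0, 7.0, 5.0, 2.5, 4.0, 5.5, 7.0, 9.5, 6.5, 3.5]
rereferenced = average_reference(raw)
```

After this step the values sum to zero across electrodes at each time point, so a map shows each site relative to the mean of the array rather than relative to the frontal reference screw.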
Results
During phase 0 of the protocol trial outcomes were only indicated by the presence or absence of reward delivery. In phase I we introduced visual cues—i.e., performance feedbacks—signaling an impending reward or no reward.
Behavior
Longitudinal testing
Performances in the problem-solving task were near optimal and stable for both monkeys during phase 0 (supplemental Fig. S1A–D, supplemental notes, available at www.jneurosci.org as supplemental material). We checked for stability with a linear fit on measures for each parameter over the 11 sessions preceding phase I. Only parameter n2 (number of trials in repetition) showed a significant reduction for monkey R. In fact, the history of training for animal R was different in that phase 0 was 1 month after the first practice of the final task with control on eye movements (monkeys were first trained without eye control). For monkey S, the data for phase 0 were taken 5 months after the first practice with control on eye movements.
Before addressing the specific effect of inserting visual feedbacks we tested the overall effect of steps on performance. Among behavioral parameters only Shift in repetition for both monkeys [one-way ANOVA, factor “step” (steps 1–5), at p < 0.05; Shift: F(4,25) = 2.95, p = 0.041; F(4,19) = 4.56, p = 0.0095 for monkey S and R, respectively] and Motivation for monkey S changed with steps (M: F(4,25) = 4.66, p = 0.006).
Data for reaction times (RTs) and saccade latencies were extracted for each session (day of recording). RTs were longer in search than in repetition periods on average over all sessions for monkey S (paired t test over phases 0–I sessions; t = 8.18, df = 29, p < 0.0001) (Fig. 2A) and only over phase I sessions for monkey R (paired t test over phase I sessions; t = 3.61, df = 23, p = 0.0041) (Fig. 2B,D). These interperiod differences are in accordance with previous observations on changes (increase or decrease depending on individuals) between search and repetition periods (Procyk and Goldman-Rakic, 2006; Quilodran et al., 2008). RTs in repetition were strongly affected by steps (one-way ANOVA, factor “step,” all tests at p < 0.05, for the two animals), with changes possibly triggered by feedback insertion (see below) for monkey S and R (Fig. 2C,D). The search versus repetition difference remained along sessions for monkey S while it was stably expressed only in phase I sessions for monkey R, suggesting changes in skill or strategy.
As for RTs, saccade latencies were different between search and repetition for both monkeys although not with the same pattern (paired t test over all sessions; monkey R, t = 11.39, df = 23, p < 0.0001; monkey S, t = −3.41, df = 29, p < 0.005) (Fig. 2E,F).
Inserting positive and negative visual feedbacks
The insertion of visual feedbacks induced slight changes in performance (supplemental Fig. S1, supplemental notes, available at www.jneurosci.org as supplemental material). It is interesting to note that at this stage monkeys could still rely on the delivery of reward at the extinction of visual feedbacks. The visual feedbacks were just giving anticipatory information on the impending occurrence of reward delivery.
At the insertion of visual feedbacks RTs for search trials were increased in the two monkeys (Student's t test between steps 2 and 3 with individual trials: t = −12.74, df = 2004, p < 0.0001 for monkey S and t = −2.11, df = 1892, p = 0.035 for monkey R). Both monkeys also showed significant changes in RTs in repetition after inserting feedback (Student's t test between steps 2 and 3: t = −12.34, df = 2651, p < 0.0001 for monkey S and t = 4.07, df = 2748, p < 0.0001 for monkey R) (Fig. 2C,D). In addition, data for monkey S revealed a global reduction of RTs across phase I steps of the protocol that might correspond to a general learning process initiated by feedback insertion (One-way ANOVA, factor “step” (steps 3 to 5), at p < 0.05, search F(2,16) = 5.05, p = 0.020 and repetition F(2,16) = 4.52, p = 0.028).
Saccade latencies revealed slight and inconsistent changes after feedback insertion, with a decrease of latencies in repetition for monkey R (t test between steps 2 and 3 with individual trials, t = 2.94, df = 2612, p = 0.003) and an increase of latencies in search for monkey S (t test between steps 2 and 3, t = −1.9983, df = 1896, p = 0.046).
Feedback-related potentials
Analyses of event-related signals during phase 0 revealed brain potentials that differed between positive and negative feedbacks. However, reward delivery in case of correct trials prevented a pure comparison between the two trial types. We thus inserted visual feedbacks that preceded actual outcomes by 500 ms, without any other change in the task. When aligned on visual feedback onset, the recordings made in the subsequent phases revealed feedback-related potentials sensitive to successes and failures. We describe here the main characteristics of these potentials.
The main potentials described in this study will be labeled FRPs. Note that the configuration of our electrodes and reference has no correspondence with the usual ones used in humans. Thus, the most important aspect of potentials will be the presence of significant discriminations between different feedback types and in particular between negative and positive feedbacks.
Shape, latencies, and valence
Signals recorded during phase I averaged from two electrodes were selected for illustration (see Materials and Methods) (Fig. 1D). These same electrodes were used for all subsequent analyses. We identified several peaks with similar latencies for the two monkeys (supplemental Fig. S2A, available at www.jneurosci.org as supplemental material). Latencies at peak and peak amplitudes for the two animals are described in supplemental Table S1, available at www.jneurosci.org as supplemental material. Two major events are analyzed here: the most positive value within 0.15–0.25 s [early feedback-related potential (eFRP)], and the most negative value within 0.25–0.35 s [late feedback-related potential (lFRP)]. Negative feedbacks elicited a large positive deflection peaking ∼170–220 ms (eFRP) followed by a negative deflection ∼300 ms (lFRP) (supplemental Fig. S2A, available at www.jneurosci.org as supplemental material). Note that the overall shape of potentials is not strictly identical between the two animals for positive feedbacks, although major effects were reproducible. For clarity, the grand average waveform is presented in Figure 3A. All subsequent analyses were performed individually for the two monkeys.
We analyzed feedback-related potentials by separating three types of trials: INC and CO1 for search periods, and COR for repetition periods. We first tested the dynamics of the difference in average amplitudes between INC and COR waves reconstructed on 20 ms bins. An ANOVA design with a built-in Helmert contrast (at p < 0.05; see Materials and Methods) revealed a first stable significant difference between INC and COR potentials at 100–120 ms and 120–140 ms for monkeys S and R, respectively (supplemental Fig. S2B, available at www.jneurosci.org as supplemental material). The eFRP was larger for INC than COR trials. The later component was more negative in amplitude for negative than for positive feedbacks (Fig. 3A; supplemental Table S1, available at www.jneurosci.org as supplemental material). Note that a late positive component for positive feedback was not observed on the average waveform in one subject (supplemental Fig. S2A, available at www.jneurosci.org as supplemental material). However, a session-by-session analysis showed that this component emerged progressively through phase I sessions. The surface maps reconstructed for FRPs following negative feedbacks revealed comparable variations between monkeys, in time, amplitude, and topography, especially for the early positive component (eFRP) (Fig. 3B). Before subtracting the average of all electrodes from each channel (Fig. 3B,C), we verified that maps computed with the original reference presented similar spatial features. Each map in Figure 3B represents the mean values for a particular time window.
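The window-based peak measures defined above (most positive value within 0.15–0.25 s, most negative value within 0.25–0.35 s) can be sketched as follows. The function name and the synthetic waveform are assumptions introduced for illustration; only the two time windows come from the text.

```python
def peak_in_window(erp, times_ms, start_ms, end_ms, polarity):
    """Return (latency_ms, amplitude) of the extreme value in a time window.

    polarity -- "pos" for the most positive value, "neg" for the most negative
    """
    window = [(v, t) for v, t in zip(erp, times_ms) if start_ms <= t <= end_ms]
    if not window:
        raise ValueError("window contains no samples")
    pick = max if polarity == "pos" else min
    amplitude, latency = pick(window, key=lambda vt: vt[0])
    return latency, amplitude

# Synthetic waveform (µV): positive triangle peaking at 180 ms,
# negative triangle peaking at 300 ms, sampled every 1.25 ms.
times = [i * 1.25 for i in range(321)]  # 0 to 400 ms
wave = [8.0 * max(0.0, 1 - abs(t - 180) / 40)
        - 4.0 * max(0.0, 1 - abs(t - 300) / 40) for t in times]

efrp = peak_in_window(wave, times, 150, 250, "pos")
lfrp = peak_in_window(wave, times, 250, 350, "neg")
print(efrp, lfrp)  # (180.0, 8.0) (300.0, -4.0)
```

Applying the same function to a difference wave (INC minus COR, sample by sample) would give the peak latency and amplitude of the valence effect within the chosen window.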
To clearly evaluate the contrast between negative and positive feedbacks we computed the difference waves between feedback types as previously applied in the literature (Yeung et al., 2005). The difference wave INC-COR for phase I confirmed, for the two monkeys, a maximum effect of valence ∼170 ms (167 ± 13 ms and 168 ± 5 ms for monkey S and R, respectively) (Fig. 3C, Table 1). Note the high similarity in peak latencies and overall time course between the two animals, as well as in the topographic distribution of the early positive peak. Maps obtained from the difference (at 170 ms) between negative feedbacks in search and positive feedbacks in repetition trials (INC-COR) presented similar topography for the two animals and evidenced a positive difference-potential more lateralized over the right hemisphere for both subjects (Fig. 3C, right).
Analyses at the peak of difference revealed, for both monkeys, differences between INC and CO1, but also between CO1 and COR, although only marginally for monkey S (average signal on time window 150–180 ms; Student's t test, INC vs CO1: t = 9.07, df = 36, p < 10−4 for monkey S and t = 5.01, df = 26, p < 10−4 for monkey R; CO1 vs COR: t = 1.94, df = 36, p = 0.06 for monkey S and t = 3.40, df = 26, p = 0.0022 for monkey R). Thus, FRPs discriminated negative from positive feedbacks and were also sensitive to the period in which positive feedback occurred (search vs repetition).
Effects of trial and error learning and expectations on feedback-related potentials
The difference between INC and COR reveals one aspect of the effect of valence induced by negative and positive feedbacks. However, the reinforcement learning theory of the ERN (RL-ERN) also predicts that FRPs should vary according to reward prediction error i.e., according to the level of reward expectation. In our case, expectation varies between CO1 and COR trials. The RL-ERN theory precisely predicts that unexpected positive feedbacks should have a larger effect on frontal-medial potential than expected positive feedback when compared to negative feedback (Holroyd, 2004; Holroyd et al., 2008). Contrary to predictions, the difference curve for INC-CO1 was marginally smaller at the 170 ms peak than INC-COR (Fig. 4A). Although the effect was not significant over the 3 steps for each monkey individually, the effect was consistent on step 2 for both monkeys (paired t test over sessions, monkey S: t = −4.46, p < 0.05; monkey R: t = −2.9, p < 0.05) (Fig. 4B).
Long-term follow-up of feedback-related potentials
Chronic recordings give the opportunity to observe long-term changes in brain signals and modifications reflecting learning mechanisms. To test the contribution of the visual properties of feedback stimuli per se, we evaluated the effect of changing the visual attributes of the negative feedback on the potential evoked by the SC signaling each end of repetition periods and that had the same visual properties (see Materials and Methods). Paired t tests (p < 0.05) applied on session measures in phase I returned a significant difference between eFRPs for INC and SC; the difference remained significant (p < 0.05) after negative feedback change (supplemental Fig. S3A, available at www.jneurosci.org as supplemental material). If the visual attributes were the sole cause of ERP changes then changing the negative feedback should have had the same effect on INC- and SC-related potentials. The overall evolution of the SC potential was opposite to the evolution for INC: eFRP for INC was marginally reduced after change of negative feedback whereas eFRP increased for SC (supplemental Fig. S3A, available at www.jneurosci.org as supplemental material). This supports a selective action of feedback change on INC FRP.
Measures over 10 sessions taken 7 months after the beginning of phase I revealed a stable presence of the peak of difference INC-COR (supplemental Fig. S3B, available at www.jneurosci.org as supplemental material) (peak latency: 150 ± 8 ms, amplitude: 7.36 ± 2.2 μV for monkey R; peak latency: 212 ± 33 ms, amplitude: 9.13 ± 2.5 μV for monkey S. Expressed as mean ± SD). During these sessions eFRPs continued to discriminate INC from COR trials (bin 140–160 ms, t = 7.5538, df = 18, p < 10−6 for monkey R; bin 200–220 ms, t = 6.1975, df = 18, p < 10−5 for monkey S).
Modulations under haloperidol
In humans the response-locked ERN is sensitive to dopaminergic pharmacology although the underlying mechanisms are disputed (de Bruijn et al., 2006; Jocham and Ullsperger, 2009). We thus tested whether FRPs were changed after systemic haloperidol administration. Haloperidol was given at a fixed dose of 0.01 mg · kg−1. Data acquired 30 min or 3 h after injections were analyzed separately. We observed a time-dependent effect both on reaction times and INC-COR difference waves, with stronger alterations after 3 h.
Behavior
Haloperidol injections had only weak effects on the trial and error strategy, at least as evaluated by our analyses. Both monkeys performed the task as well as in control conditions (parameters n1, n2, and P were not significantly affected by haloperidol). A drop in motivation was observed for both monkeys (Student's t test, control vs haloperidol, M: monkey S t = 4.13, df = 13, p = 0.012, and monkey R t = 5.15, df = 13, p = 0.0002). RTs increased significantly following haloperidol injections, especially in the 3 h condition (Fig. 5A), for which all RTs were significantly increased for both monkeys (Control vs 3 h, Student's t test, search RT: monkey S t = −9.57, df = 1511, p < 10⁻⁴ and monkey R t = −16.3, df = 6139, p < 10⁻⁴; repetition RT: monkey S t = −13.06, df = 2003, p < 10⁻⁴ and monkey R t = −20.4, df = 8415, p < 10⁻⁴).
Feedback-related potentials
The effect of haloperidol was weak and inconsistent when measured on FRP. However, a consistent significant effect of DA blockade was observed for the first peak of INC-COR difference-waves. For both monkeys, the peak amplitude in haloperidol sessions was slightly reduced 30 min after injections, and was clearly lower 3 h after injections (Fig. 5B). Permutation tests were used to evaluate the significance of decreases and confirmed the effect at 3 h for both monkeys (see statistics in Fig. 5B). A control analysis revealed that the time effect (i.e., a reduction of difference wave between the beginning and end of a session) was absent during control conditions (t test, between the first and last 45 min in control conditions; monkey S: t = −1.76, df = 16, p = 0.1; monkey R: t = −1.25, df = 18, p = 0.23).
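The text does not specify how the permutation tests were constructed. As one possible reading, the sketch below shuffles session labels between control and haloperidol sessions and asks how often a reduction in peak amplitude at least as large as the observed one arises by chance. The function name `permutation_p`, the one-sided design, the number of permutations, and all amplitude values are assumptions made for illustration.

```python
import random
import statistics

def permutation_p(control, drug, n_perm=10000, seed=0):
    """One-sided p value for mean(control) - mean(drug) under label shuffling."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = statistics.mean(control) - statistics.mean(drug)
    pooled = list(control) + list(drug)
    n_control = len(control)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = (statistics.mean(pooled[:n_control])
                - statistics.mean(pooled[n_control:]))
        if diff >= observed:
            hits += 1
    # Add one to numerator and denominator so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical peak amplitudes (microvolts) of INC-COR difference waves per session.
control_peaks = [7.9, 8.4, 7.1, 8.8, 7.6, 8.0]
haloperidol_peaks = [5.2, 6.0, 4.8, 5.9, 5.5, 5.1]
p_value = permutation_p(control_peaks, haloperidol_peaks)
print(f"one-sided permutation p = {p_value:.4f}")
```

A permutation test of this kind makes no assumption about the distribution of peak amplitudes, which suits small numbers of sessions.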
We also tested whether the effect of haloperidol was selective for ERPs related to performance feedback, or whether haloperidol affected evoked potentials in general. The literature shows that treatment with haloperidol does not alter stimulus-locked N1 potentials (Zirnheld et al., 2004; de Bruijn et al., 2006). Accordingly, we looked for effects on evoked potentials triggered by stimulus onset. There was no effect on the measured negativity peaking 100 ms after target onset, 3 h after haloperidol injection (permutation tests, all tests nonsignificant; see supplemental Fig. S4, available at www.jneurosci.org as supplemental material).
Discussion
We report the first evidence of surface frontal feedback-related potentials recorded chronically in monkeys during a cognitive task. The properties of these potentials during trial and error and their sensitivity to the administration of haloperidol support their homology with FRPs observed in humans.
Medial frontal feedback-related potentials in primates
The major finding of the present work is the description of a medial frontal potential related to performance feedback in the monkey. Our initial motivation to search for this potential was to assess the sometimes criticized homology between data obtained in monkeys (especially using single unit recordings) and in humans (using EEG or functional magnetic resonance imaging, fMRI) (Botvinick et al., 2004). In the present context the functional homologies are discussed especially regarding the ACC. To address this problem, one important approach is to compare data acquired in the two species with the same technology. For instance, unit recordings in human ACC confirmed its role in processing reinforcement-related information as shown in monkey experiments (Ito et al., 2003; Matsumoto et al., 2003; Williams et al., 2004; Quilodran et al., 2008). fMRI in behaving monkeys will be a further important step to compare data between the two species. Here, we chose to address the problem using event-related potentials, a technique widely used to study performance monitoring. Note that similarities between monkey and human ERPs have been documented for other stimulus-locked potentials (Pineda et al., 1997; Woodman et al., 2007).
The FRPs that we describe in monkeys are in many ways comparable to the medial frontal event-related potentials described in humans (Donkers et al., 2005; Cohen et al., 2007). Like human ERP, monkey FRP is sensitive to alteration of dopamine transmission (Zirnheld et al., 2004; de Bruijn et al., 2006). While this in itself is interesting, the impact of dopaminergic alteration on FRPs and its prevalence compared to other neurotransmitters remains debated (Jocham and Ullsperger, 2009). Future experiments could directly address these issues.
Although we observed interindividual variability of the ERP shapes, the calculation of difference waves demonstrated a strong similarity between subjects. This reflects the consistency of one of the main characteristics of information processing within the performance monitoring system, namely the discrimination between negative and positive feedbacks. Importantly, the difference waves demonstrated a peak at ∼170 ms that differs from the human reports [∼290–300 ms (Nieuwenhuis et al., 2005; Yeung et al., 2005)] by a 3/5 ratio that approximately fits the proposed rule of correspondence between human and nonhuman primates (Schroeder et al., 2004; Foxe and Schroeder, 2005).
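The latency correspondence can be checked with simple arithmetic. The short sketch below uses only the approximate latencies quoted above (∼170 ms in monkeys, ∼290–300 ms in humans); the variable names are illustrative.

```python
# Approximate peak latencies quoted in the text, in milliseconds.
monkey_latency_ms = 170.0
human_latencies_ms = (290.0, 300.0)

# Observed monkey/human ratio for each end of the human range.
ratios = [monkey_latency_ms / h for h in human_latencies_ms]
print([round(r, 2) for r in ratios])  # [0.59, 0.57]

# Monkey latency predicted by a strict 3/5 rule for each human latency.
predicted_ms = [h * 3 / 5 for h in human_latencies_ms]
print(predicted_ms)  # [174.0, 180.0]
```

Both ratios fall slightly below 3/5 = 0.6, which is consistent with describing the fit as approximate.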
Feedback-related potentials and expectations
In humans, FRPs follow several properties predicted by reinforcement learning rules and, in particular, a sensitivity to outcome predictability (Frank et al., 2005; Cohen et al., 2007; Eppinger et al., 2008; Holroyd et al., 2009). The description of a larger FRP for the first, unsure, correct trial compared to certain correct trials in repetition also concurs with such sensitivity to outcome predictability. Investigations on modulations by reward size and probability during the trial and error period are in progress.
It has been proposed recently that the main variance observed in the difference waves between positive and negative feedback-related ERP should be attributed to positive feedbacks and not to negative feedbacks (Holroyd et al., 2008). Based on the reinforcement learning theory of the ERN, Holroyd et al. (2008) proposed that during unexpected positive feedback the dopaminergic afferent to the ACC reduces a phenomenon otherwise observed after negative feedbacks or other task-relevant events, and hence reduces the observed surface feedback-related potential. Based on this hypothesis, one can predict that after unexpected positive feedback, the FRP should be smaller than after expected positive feedback. Several experiments in humans support this hypothesis (Holroyd et al., 2008; Martin et al., 2009). With the present protocol, the peak of difference waves INC-CO1 should be larger than INC-COR. In contrast, we observed a weak but opposite effect. Further experiments are needed to confirm this effect and a possible divergence between human and monkey data. Nevertheless, data from previous neurophysiological studies in monkeys might help explain this discrepancy. Unit and local field potential recordings during the PST have revealed that ACC neurons participate in the detection and evaluation of positive and negative feedbacks (INC and CO1) when those are meaningful for adaptation, i.e., during the search period (Quilodran et al., 2008). Emeric et al. (2008) recently described negative feedback-related potentials from recordings of ACC local field potentials. These data support at least a partial source of surface feedback-related potentials within the ACC. Here we show that FRPs are larger for INC and CO1 feedbacks than for COR. One interpretation is then that the different FRPs reflect the processing by different neural entities, and the activation of different ACC neuronal populations, of negative and positive feedbacks in search periods.
The increased signals for CO1 and INC compared to COR might also come from nonselective processing of feedbacks in search, as observed in ACC recordings (INC/CO1 neurons in Quilodran et al., 2008). However, as for unit and local field potential recordings in ACC, signals for positive feedback in COR trials are weaker.
Perspectives for chronic observations of performance monitoring
It is very likely that monkey FRPs reflect, as in humans, the integrity of the performance monitoring system (Ullsperger and von Cramon, 2006). The sensitivity to dopaminergic transmission could be explained by the modulatory role of dopamine on the cingulo-accumbens loop, that is, on striatal and/or cortical processing of performance feedback. It has been shown that genotypic characteristics of subjects concerning D2 receptor regulation (DRD2-TAQ-IA polymorphism) are correlated with learning strategies and with stronger activations of ACC and dorsolateral prefrontal cortex for performance feedbacks (Klein et al., 2007). Furthermore, several studies showed attenuated ERN in Parkinsonian patients, including at early stages of the disease (Falkenstein et al., 2001; Ito, 2004; Stemmer et al., 2007; Willemssen et al., 2008). However, the link between dopamine and FRP remains unresolved and its sensitivity to other neuromodulators needs to be further investigated (Jocham and Ullsperger, 2009). The chronic EEG model in monkeys allows for direct investigations of the relationships between neuromodulatory systems, ACC activity, and FRPs.
Footnotes
This research was funded by CNRS–Centre National de la Recherche Scientifique, Fondation NRJ, Agence Nationale de la Recherche Grant ANR JCJC-0048 (E.P.), and Fondation pour la Recherche Médicale (J.V.). We are very grateful to H. Kennedy and K. Knoblauch for help with this manuscript and with data analyses.
- Correspondence should be addressed to Emmanuel Procyk, Inserm U846, Stem Cell and Brain Research Institute, 18 avenue du Doyen Jean Lépine, 69500 Bron, France. emmanuel.procyk{at}inserm.fr