Associating action with its reward value is a basic ability needed by adaptive organisms and requires the convergence of limbic, motor, and associative information. To chart the basal ganglia (BG) involvement in this association, we recorded the activity of 61 well isolated neurons in the external segment of the globus pallidus (GPe) of two monkeys performing a probabilistic visuomotor task. Our results indicate that most (96%) neurons responded to multiple phases of the task. The activity of many (34%) pallidal neurons was modulated solely by direction of movement, and the activity of only a few (3%) pallidal neurons was modulated exclusively by reward prediction. However, the activity of a large number (41%) of single pallidal neurons was comodulated by both expected trial outcome and direction of arm movement. The information carried by the neuronal activity of single pallidal neurons dynamically changed as the trial progressed. The activity was predominantly modulated by both outcome prediction and future movement direction at the beginning of trials and became modulated mainly by movement-direction toward the end of trials. GPe neurons can either increase or decrease their discharge rate in response to predicted future reward. The effects of movement-direction and reward probability on neural activity are linearly summed and thus reflect two independent modulations of pallidal activity. We propose that GPe neurons are uniquely suited for independent processing of a multitude of parameters. This is enabled by the funnel-structure characteristic of the BG architecture, as well as by the anatomical and physiological properties of GPe neurons.
The hypothesis that the basal ganglia (BG) play a role in reward-modulated formation of stimulus-response associations is supported by several lines of evidence. These include animal lesion studies (Packard et al., 1989; Miyachi et al., 1997; Ferry et al., 2000) and human imaging studies (Jenkins et al., 1994; Toni and Passingham, 1999). Moreover, neuropsychological studies of human patients have shown that the BG must be intact for subjects to learn outcome probabilities predicted by different cues (Knowlton et al., 1996; Sage et al., 2003). Indeed, neural correlates of reward-modulated stimulus-response association have been found in the input (Kawagoe et al., 1998; Tremblay et al., 1998; Jog et al., 1999; Watanabe et al., 2003) and output (Handel and Glimcher, 2000; Sato and Hikosaka, 2002) structures of the BG. Activity in these structures is assumed to be modulated by reward-related activity of the midbrain dopaminergic neurons (Hollerman and Schultz, 1998; Fiorillo et al., 2003; Satoh et al., 2003; Morris et al., 2004) and striatal tonically active neurons (TANs; presumably striatal cholinergic interneurons) (Wilson et al., 1990; Aosaki et al., 1994; Ravel et al., 2003; Morris et al., 2004).
Recent data obtained in anatomical, physiological, and theoretical studies suggest that a revision is required of the role of the external segment of the globus pallidus (GPe) within the general context of BG structure and function. First, whereas the recent observations mentioned above link the BG to cognitive and limbic functions, the GPe has been studied intensively in the last three decades as a purely motor nucleus (DeLong, 1971; Mink and Thach, 1987; Mitchell et al., 1987; Hamada et al., 1990; Turner and Anderson, 1997). This approach is still dominant in GPe studies, in part because of BG involvement in Parkinson's disease, the relief of the motor symptoms of Parkinsonism by pallidal lesions or electrical stimulation (Hariz, 1999; Yelnik et al., 2000), and the correlation between pallidal neuron activity and tremor (Hutchison et al., 1997; Levy et al., 2002). Second, although the classical approach (Albin et al., 1989; DeLong, 1990) regards the GPe as a relay station along the BG indirect pathway, current views consider the GPe to be a major participant in BG computation (Bolam et al., 2000). Microstimulation studies have suggested that single pallidal neurons can integrate cortical information arriving from several parallel pathways (Kita, 1992; Nambu et al., 2000). Furthermore, GPe neurons can strongly influence both BG input and output structures by their powerful GABAergic synapses (Bevan et al., 1998; Smith et al., 1998; Plenz and Kitai, 1999; Bolam et al., 2000).
A comprehensive understanding of the BG role in associating action with outcome therefore calls for a study of physiological information processing at the GPe level. Here, we focus on the role of the GPe in associating visual stimuli with action in the context of different predicted outcomes. We analyzed the neural activity of single GPe neurons during correct performance of a probabilistic visuomotor task. Special care was taken to study well isolated pallidal neurons to enable conclusions regarding modulation of the activity of a single neuron by more than one event.
Materials and Methods
General. We performed the experiments on two Macaca fascicularis monkeys (Y, male weighing 5 kg; C, female weighing 2 kg). The monkeys were trained to sit in a primate chair and to perform the behavioral task for a liquid reward. After a few months of training, the monkeys underwent surgery under deep general anesthesia, and an 18 mm round or 27 mm square recording chamber (monkey Y and monkey C, respectively) was attached to the skull together with a head holder. Both recording chambers were tilted 45° laterally in the coronal plane with their centers targeted by a stereotaxic device to cover the entire area of the GPe located contralateral to the monkey's working hand. The target stereotaxic coordinates were as follows: monkey Y: anterior 14.5, lateral 6, horizontal 3; monkey C: anterior 11, lateral 3.5, horizontal 1.5 (Szabo and Cowan, 1984; Martin and Bowden, 2000). The exact position of the chamber was established using a high-resolution magnetic resonance imaging (MRI) scan (Fig. 1) (Biospec Bruker 4.7 Tesla animal system, fast spin echo sequence; effective echo time, 80 msec; repetition time, 2.5 sec; 0.25 × 0.25 × 2 mm resolution). In monkey Y, the MRI scan was performed without an electrode, and recording areas were estimated only according to the borders of the saline-filled recording chamber. Identification of the recording area in monkey C was also aided by an electrode that was positioned in the center of the chamber. All recordings in monkey C were performed in the lateroposterior quadrant of the chamber. Post-surgery electrophysiological mapping was also performed on both monkeys. All procedures were conducted according to The Hebrew University guidelines for animal care and were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals.
During the recording session, the monkey's head was immobilized, and four to eight glass-coated tungsten microelectrodes (impedance, 0.2-1.0 MΩ at 1000 Hz), confined within a cylindrical guide tube, were advanced separately to the GPe (Electrode Positioning System; Alpha-Omega Engineering, Nazareth, Israel). Each electrode signal was amplified with a gain of 10,000 and bandpass filtered with a 300-6000 Hz four-pole Butterworth filter (Multi-Channel Processor+; Alpha-Omega Engineering). This electrical activity was sorted and classified on-line using a template-matching algorithm (Multi-Spike Detector; Alpha-Omega Engineering). The sampling rate of spike detection pulses and behavioral events was 12 kHz (AlphaMap; Alpha-Omega Engineering).
GPe neurons were identified by the recordings of neuronal activity characteristic of the striatum (injury potentials and TANs) obtained above in the same trajectory. After identification of the dorsolateral border of the GPe, we encountered the typical electrophysiological activity of GPe neurons. This included (1) a symmetric, narrow (<1 msec) and high-amplitude spike shape, and (2) a characteristic firing pattern, with most neurons exhibiting high-frequency discharge interrupted by pauses (HFD-P) and a minority with lower frequency of discharge with occasional brief high-frequency bursts (LFD-B). The finding of spontaneous pauses in all recording sites rules out the possibility that the recording electrode had passed to the internal segment, where most neurons display high-frequency discharge but no pauses (DeLong, 1971). We recorded the activity of the first well isolated and stable pallidal unit in each penetration. The distance of the GPe neurons of this study from the ventromedial border of the striatum was <2.5 mm. Of the 61 neurons that were analyzed in this study, only a single neuron was an LFD-B neuron (mean discharge rate, 13 spikes/sec; responding to both reward-probability and future-movement direction). All other neurons exhibited HFD-P on-line (mean firing rate, 48 ± 25 spikes/sec) (DeLong, 1971).
Behavioral task. The monkey initiated a trial by keeping its hand on the central button during the pre-cue interval indicated by a red screen (Fig. 2). After a variable period of 2-4 sec, one of four visual cues appeared on the left or the right side of the screen for a set time (monkey Y, 470 msec; monkey C, 330 msec). Each cue represented the probability of reward delivery at the end of a successful trial. After the cue disappeared, the monkey kept pressing the central button for another 2 sec of post-cue interval, also indicated by a red screen. The screen color turned yellow for the “go” signal, after which the monkey had to press one of two buttons located on either side of the central button, corresponding to the cue side. The monkey's response time was limited (monkey Y, 900 msec; monkey C, 750 msec). Reward (a drop of water) was given, according to the cue probability, after a 1 sec delay (reward-delay period), if the correct button was pressed. Trials were separated by 4-7 sec of blue screen, serving as an intertrial interval (ITI). In case of error, the trial was aborted and followed by an ITI. The different types of cues here are labeled according to both reward probability and direction of instructed movement; for instance, P(R)Right = 0.50 represents a cue with a 0.50 probability of reward delivery at the end of a correctly performed right button press.
Before the recordings, monkey Y had been trained for 2 months to learn the association of the four visual cues with different probabilities of getting a reward at the end of a successful trial. The probabilities were P(R) = 0.25, 0.50, 0.75, and 1.00 for the different cues. The cues appeared on the screen in a random order and random left/right location. Monkey C was trained on a P(R) = 1.00 cue for 2 months. Only 5 d before recording, she was presented with five new cues, predicting rewards at P(R) = 0.00, 0.25, 0.5, 0.75, and 1.00. However, in most of the recording sessions, monkey C was presented with only four of the five cues (pseudorandomly selected), to limit the number of different trial types. Incorrect trials were followed by the same cue up to three times to enforce behavior even in those trials in which reinforcement was absent [P(R) = 0.00] or with low probability.
In ∼5-10% of the trials, two cues appeared simultaneously (paired-cue trials), and the monkey had the choice of pressing the button with the higher or lower probability of reward delivery. These trials enabled us to evaluate the monkey's ability to associate the different cues with reward probability.
Data analysis. Two steps were taken in selecting the neurons analyzed here from the 235 recorded pallidal neurons. In the first step, only neurons with a discharge rate that was stable for at least five correctly performed trials of each type [e.g., P(R)Left = 0.75, yielding a minimum of 40 trials] were chosen (139 of 235). For rate stability analysis, the rate of discharge during the ITI period was displayed for the entire period of recording, and the largest continuous segment of stable data was selected and examined for the minimum trial criterion. In the second step, we selected only spike trains that were considered to originate from a single cell during real-time sorting and were further confirmed by having an interspike interval histogram characteristic of a well isolated neuron. This second step was crucial to draw reliable conclusions regarding modulation of the activity of a single neuron by more than one event. If the spikes of two or more different neurons are erroneously identified as emitted by a single unit, the multi-event responses of this unit can be the outcome of the overlay of the responses from several single neurons, each responding to a single event. At the end of this selection process, the database contained 61 neurons that were stable for 172 trials, on average (range, 53-359). The response properties of the cells were examined only after testing for these inclusion criteria. The behavioral and neuronal data are based on the same recording epochs. The data analysis was conducted with custom-made Matlab (MathWorks, Natick, MA) programs.
Reaction times and movement times for correct responses were quantitatively analyzed using a two-way ANOVA (reward probability × direction). Statistical results were accepted as significant at a value of p < 0.05.
To determine the number of task-related neurons, we used non-smoothed 50 msec binned peristimulus time histograms (PSTHs). The spike trains were aligned to the appearance of the visual cue, starting 2000 msec before it and lasting 4300 and 3800 msec after it, in monkeys Y (total 126 bins) and C (total 116 bins), respectively. This time period was chosen to exclude possible uncontrolled movements performed in preparation for the following trial. Because many of our recorded neurons were movement related (see below), we preferred not to include this uncontrolled portion of the trial in the neuronal data analysis. For this analysis, trials of all reward probabilities and movement directions were pooled. The baseline rate and its SD were calculated over the last 1 sec (20 bins) preceding the visual cue. A neuron was considered task related if at least three consecutive 50 msec bins deviated by at least 2 SDs from the mean baseline firing rate of the neuron.
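The task-relatedness criterion described above can be sketched in a few lines of Python. This is only an illustration, not the authors' Matlab code: the function name, the default `cue_bin` index, the use of the sample SD, the restriction of the test to bins from cue onset onward, and the choice to count deviations in either direction within one run are all assumptions made here.

```python
from statistics import mean, stdev

def is_task_related(psth, cue_bin=40, baseline_bins=20, n_sd=2.0, min_run=3):
    """Return True if the cue-aligned PSTH meets the 3-consecutive-bin criterion.

    psth          -- firing rate per 50 msec bin (cue onset at index cue_bin)
    baseline_bins -- number of pre-cue bins used for the baseline (20 bins = 1 sec)
    """
    baseline = psth[cue_bin - baseline_bins:cue_bin]
    mu = mean(baseline)
    sd = stdev(baseline)  # assumption: sample SD (n - 1 denominator)

    run = 0
    for rate in psth[cue_bin:]:  # assumption: only bins from cue onset are tested
        deviation = abs(rate - mu)
        # A bin counts if it deviates by at least n_sd SDs (and is not exactly flat).
        if deviation >= n_sd * sd and deviation > 0:
            run += 1
            if run >= min_run:
                return True
        else:
            run = 0
    return False
```

With 2000 msec of pre-cue data in 50 msec bins, cue onset falls at bin index 40, which is why that value is used as the default here.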
To study which information was conveyed by each pallidal neuron, we sorted the cue-aligned spike trains (starting 2000 msec before the visual cue and lasting 4300 and 3800 msec in monkeys Y and C, respectively) according to trial type (reward probability and cue position) and computed the spike count in 50 msec bins. Thus, a distribution of spike counts in parallel time-lag bins in every trial type allowed us to perform a two-way ANOVA (reward probability × cue position; also testing for nonlinearly additive two-factor interaction): Y = YSpontaneous + YDirection + YP(R) + YDirection,P(R) + ϵ. The ANOVA test assumes that the rate of discharge (Y) is determined by the spontaneous rate of discharge (YSpontaneous) of the neuron, the effect of direction and reward (YDirection and YP(R), respectively), the nonlinear interactions between them (YDirection,P(R)), and an error term (ϵ). We considered a neuron as “reward predicting” if the p value in the ANOVA performed on reward probability was <0.05 in three consecutive bins, <0.01 in two consecutive bins, or <0.001 in a single bin. Similarly, the neuron was considered direction sensitive if a direction effect was found at the same significance levels. Finally, the strength of modulation of GPe population activity by movement-direction and reward probability in different phases of the task was tested by using the above two-way ANOVA p values. Comparison of the p value distributions in the different task phases and their deviation from uniformity was tested using the Kolmogorov-Smirnov test.
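The three-tier significance rule (p < 0.05 in three consecutive bins, p < 0.01 in two consecutive bins, or p < 0.001 in a single bin) can be expressed compactly. The sketch below assumes that the per-bin ANOVA p values for one factor are already available as a sequence; the function and constant names are hypothetical.

```python
# (alpha, required number of consecutive bins), as stated in the text
CRITERIA = ((0.05, 3), (0.01, 2), (0.001, 1))

def longest_run_below(p_values, alpha):
    """Length of the longest run of consecutive bins with p < alpha."""
    longest = run = 0
    for p in p_values:
        run = run + 1 if p < alpha else 0
        longest = max(longest, run)
    return longest

def is_modulated(p_values, criteria=CRITERIA):
    """True if any one of the (alpha, run length) criteria is satisfied."""
    return any(longest_run_below(p_values, alpha) >= n for alpha, n in criteria)
```

Applying `is_modulated` to the reward-probability p values would flag a "reward predicting" neuron, and applying it to the direction p values would flag a direction-sensitive one.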
To study the frequency of nonlinearly additive interactions, we used the p value in the ANOVA for YDirection,P(R). The number of bins in which significant nonlinearly additive interactions took place was determined by the above criteria (p < 0.05 × 3 bins, p < 0.01 × 2 bins, p < 0.001 × 1 bin) and was divided by the total number of bins in which the neuron was modulated by at least one of the parameters (cue position, reward probability, or both parameters). The result reflects the relative time in which a nonlinear interaction took place out of the total time the neuron conveyed modulated information.
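As a rough illustration of this ratio, the sketch below marks the bins that belong to a qualifying run for each ANOVA term and divides the interaction count by the count of bins modulated by either main effect. How individual bins are counted is not spelled out in the text, so counting every bin inside a qualifying run is an assumption, as are all names used here.

```python
CRITERIA = ((0.05, 3), (0.01, 2), (0.001, 1))  # (alpha, consecutive bins)

def significant_bins(p_values, criteria=CRITERIA):
    """Indices of bins lying inside a run that satisfies one of the criteria."""
    hits = set()
    padded = list(p_values) + [1.0]  # sentinel closes a run that reaches the end
    for alpha, min_run in criteria:
        start = None
        for i, p in enumerate(padded):
            if p < alpha:
                if start is None:
                    start = i
            else:
                if start is not None and i - start >= min_run:
                    hits.update(range(start, i))
                start = None
    return hits

def interaction_fraction(p_direction, p_reward, p_interaction):
    """Interaction bins divided by bins modulated by direction, reward, or both."""
    modulated = significant_bins(p_direction) | significant_bins(p_reward)
    if not modulated:
        return 0.0  # neuron conveyed no modulated information
    return len(significant_bins(p_interaction)) / len(modulated)
```

A value of 0 means that no bin showed a significant interaction, and larger values mean that a larger share of the informative time was nonlinear.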
The tendency of pallidal neurons to monotonically increase or decrease their firing rate with increment in reward probability was tested using the linear regression method, analyzing separately the responses for left and right movements. For each single 50 msec bin in the PSTH, we fitted a regression line describing the dependence of spike count on reward probability and calculated the regression statistics. Each p value tests the hypothesis that the fitted linear slope is zero. Uniformity of p value distribution is expected in PSTHs with non-monotonically increasing or decreasing discharge rate with increasing reward probability. We compared the p value distribution in the last 40 bins of the pre-cue interval (2 sec) with the same distribution on the rest of the trial (starting from cue-display and lasting 86 and 76 bins after for each neuron in monkey Y and monkey C, respectively) using the Kolmogorov-Smirnov test (significance level, p < 0.05).
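For a single 50 msec bin, the regression step amounts to an ordinary least-squares fit of spike count on reward probability, followed by a test of the slope against zero. The sketch below is an illustration with hypothetical names. It returns the slope and its t statistic with n - 2 degrees of freedom; converting that statistic to a p value requires the Student t distribution, which is not in the Python standard library (a routine such as `scipy.stats.linregress` reports the two-sided p value directly).

```python
from math import copysign, inf, sqrt

def slope_t_statistic(reward_probs, spike_counts):
    """Least-squares slope of spike count on P(R), with its t statistic.

    Returns (slope, t, degrees_of_freedom) for one time bin of one direction.
    """
    n = len(reward_probs)
    mean_x = sum(reward_probs) / n
    mean_y = sum(spike_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in reward_probs)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reward_probs, spike_counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual sum of squares and standard error of the slope
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(reward_probs, spike_counts))
    se = sqrt(sse / (n - 2) / sxx)
    if se == 0:
        t = copysign(inf, slope) if slope != 0 else 0.0  # perfect fit
    else:
        t = slope / se
    return slope, t, n - 2
```

For example, spike counts of 2, 4, 4, 6, 6, 8, 8, 10 at P(R) = 0.25, 0.25, 0.50, 0.50, 0.75, 0.75, 1.00, 1.00 give a slope of 8 spikes per unit probability.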
Analysis of paired-cue trials was performed by comparing the spike count distribution in the single-cue trials with the paired cue trials. For this comparison, we used a one-way ANOVA (significance level as above: p < 0.05 × 3 bins, p < 0.01 × 2 bins, p < 0.001 × 1 bin). A similar method was used for the single-cue error trials analysis.
Results
We recorded the activity of neurons in the GPe of two monkeys (Y and C) performing a probabilistic visuomotor task (Fig. 2). In this task, the monkeys were required to release a central push-button and to press a button to the right or left, 2 sec after a visual cue was briefly displayed on the right or left side of a screen. The required direction of movement was indicated by the position of the visual cue, and the chances of receiving a reward at the end of a correctly performed trial [P(R), reward probability] were predicted by the cue identity. In a minority of the trials, two cues were presented simultaneously (paired-cue trials), and the monkey could optimize reward delivery probability by choosing to press the button associated with higher reward probability.
Analysis of the behavioral parameters indicates that the monkeys could correctly associate the different reward values with the respective visual cues (Fig. 3a-d). We reached this conclusion mainly by studying the trials in which a pair of cues was presented simultaneously. Both monkeys tended to press the button associated with the higher probability of reward (Fig. 3d). Most behavioral parameters in the single-cue trials were not affected by the reward probability (Fig. 3a-c). However, the percentage of correct trials of monkey Y tended to increase with the probability of receiving a reward (Fig. 3a), and the fraction of error trials and reaction times (two-way ANOVA, p < 0.001) were higher for the P(R) = 0 trials in monkey C. The latency between the “go” signal and the release of the central button (reaction time) (Fig. 3b) did not differ across trials with different outcomes in monkey Y or for the P(R) = 0.25-1.00 in monkey C. Similarly, the time between the central button release and the press of the correct peripheral button (movement time) (Fig. 3c) was not significantly different between cues for either monkey. Previous studies have shown that predicted reward outcome influences the reaction time during a spatial delayed response task in primates, but the differences in behavioral reaction parameters are attenuated in over-trained animals (Watanabe et al., 2001). These observations fit well with our data.
In contrast to the minor differences in behavioral parameters with reward probability, a significant difference was found in reaction and movement times between trials of left and right movement in both monkeys (two-way ANOVA, p < 0.001). Both monkeys reacted faster on trials that did not require hand-crossing of the body midline (Jeannerod, 1988). Movements in the same direction were also faster.
Pallidal neurons are highly responsive to the task
In this study, we analyzed data from 12 multiple-electrode recording days in monkey Y and 8 recording days in monkey C. Altogether, we analyzed 61 of 235 recorded GPe neurons from both monkeys (monkey Y, 32 neurons; monkey C, 29 neurons). These neurons were considered well isolated cells during real-time sorting, displayed a clear refractory period in their interspike interval histograms, and were stable in a firing-rate analysis for a minimum number of correctly performed trials (see Materials and Methods for details).
All the studied neurons responded to at least one phase of the task by significantly increasing or decreasing their discharge rate. Most of the neurons [59 of 61 (96%)] responded during more than one phase of the task. The number of responding neurons in each phase is summarized in the first column of Table 1.
Pallidal neurons dynamically convey information regarding reward probability and direction of required movement
To study which information was conveyed by each pallidal neuron, we performed a two-way ANOVA (reward probability × movement direction) on parallel-lag 50 msec bins of each trial type for the full trial duration until the reward delivery. Pallidal neurons can be divided into four subpopulations according to the modulation of their task-related activity. Modulation solely by direction of movement was observed in 34% of the neurons (11 of 32 neurons in monkey Y and 10 of 29 neurons in monkey C). The activity of only two neurons [one in each monkey (3%)] was modulated exclusively by reward prediction. The activity of 41% of the neurons (18 of 32 neurons in monkey Y and 7 of 29 neurons in monkey C) was comodulated by both direction of movement and reward prediction (Figs. 4, 5). Although all recorded neurons were task related, a fraction of them [13 of 61 (21%); monkey Y, 2 of 32; monkey C, 11 of 29] did not show modulation of response by either direction of movement or by reward prediction in any phase of the task. Aligning the spike trains to the beginning or end of hand movement did not change these results.
Interestingly, some comodulated neurons encoded different parameters during different epochs of the task. Neurons that displayed reward-predicting activity in one phase of the task acted as direction-sensitive neurons at later phases (Fig. 4). Other neurons conveyed information about these two parameters at the same time (Fig. 5). The number of neuronal responses modulated by the different parameters in each part of the task is shown in Table 1. Note the large number of responses modulated by both reward probability and cue position at the beginning of the trial, which evolved to purely direction-sensitive responses with trial progression.
Because the pallidal population displayed modulation of activity by two parameters throughout the task, we tested the population coding of these two parameters. We studied the population coding using the p values (as calculated by the two-way ANOVA). Each p value tested the hypothesis that the activity of a single neuron was modulated in a specific time bin either by movement direction or by predicted reward. To examine the population, we grouped the p values of all 61 pallidal neurons, pooling over all bins within each trial phase. This process was performed separately for movement direction (Fig. 6a) and for reward prediction (Fig. 6b). As expected, the p values were uniformly distributed for both direction and reward probability during the pre-cue phase of the trial (in which no information was given regarding the required movement or the likelihood of reward). However, in all other trial phases, the distributions lost their uniformity and were significantly different from the pre-cue phase (Kolmogorov-Smirnov test, p < 0.001), implying modulation by both trial parameters (direction and reward probability). Figure 6 also demonstrates the dynamic changes in the strength of modulation along the different trial segments. The direction of hand movement primarily affected the neuronal activity during movement (Fig. 6a, dashed trace). The outcome, however, exerted the strongest effect during the cue display (Fig. 6b, dotted trace).
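The population comparison rests on the two-sample Kolmogorov-Smirnov statistic, the largest vertical distance between the empirical cumulative distributions of two pooled sets of p values (for instance, pre-cue bins versus cue-display bins). The sketch below computes only that statistic. The function name is hypothetical, and the significance level would have to come from a statistics package (for example, `scipy.stats.ks_2samp`), since none is derived here.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D = max |F_a(x) - F_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    n, m = len(a), len(b)
    d = 0.0
    # The empirical CDFs only change at observed values, so check those points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / n
        cdf_b = bisect_right(b, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d
```

Pooled p values that stay close to uniform in both phases give a D near 0, whereas a phase dominated by small p values pushes D toward 1.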
Monotonic change in pallidal activity with increment of reward probability
Dopamine neurons elevate their firing rate when information on upcoming reward is received (Hollerman and Schultz, 1998). Furthermore, this elevation in firing rate increases with the increment in future reward probability (Fiorillo et al., 2003; Morris et al., 2004). In contrast, the rate of discharge of pallidal neurons can either increase or decrease, displaying three distinct patterns of modulation by reward probability. Information regarding future reward was observed in the activity of 18 pallidal neurons during periods of increased firing rate. This increment above spontaneous firing rate (calculated in the pre-cue period) was higher as the probability of reward delivery increased. An example of such a neuron is shown in Figure 4 during the cue display. In 20 neurons, the opposite pattern was observed; a deeper trough in their firing rate was noted with the increase in reward probability (Fig. 5, cue-display). This suggests that the deviation (positive or negative) from the background discharge rate encoded the probability of future outcome in the majority of GPe neurons. We also found five bipolar responses, i.e., increased firing rate in trials with high likelihood of reward and a decrease in firing rates when reward was less likely (Fig. 5, reward-delay), or vice versa. As can be seen from the example in Figure 5, different modes of response can be observed in a single neuron at different trial epochs, and therefore the above-mentioned groups of neurons are not mutually exclusive.
We tested the relationship between the neuronal response and reward probability using the linear regression method. Note that a linear relationship is not always the optimal way of describing the relationship between these two parameters. For example, the neuron in Figure 4 responded in a similar way to the P(R) = 0.75 and P(R) = 1.00 cues and thus is a poor candidate for linear modeling. However, we used the linear model as a conservative indicator of a monotonic change (increment or decrement) in firing rate in relation to reward probability. For each reward-predicting neuron (comodulated and pure reward modulated), we fit a regression line to the spike counts in each time bin as a function of reward probability. This procedure was performed separately for each neuron, and the p values of the resulting regression were pooled over all time bins within the behavioral epoch and grouped over all neurons to describe the population (Fig. 7). The distribution of p values of the linear regression in the pre-cue interval was uniform, as expected. This result confirms that the smaller number of correct trials in trials with low reward probability did not bias the regression line. However, the situation was different for the bins throughout the rest of the trial (from cue-onset and thereafter). Here, the distribution contained a larger than expected number of small p values. These two distributions (pre-cue and rest of the trial) are significantly different (Kolmogorov-Smirnov test, p < 0.001). These results imply that the population of GPe neurons tends to monotonically modulate (increase or decrease) its firing rate with the change in reward probability.
Reward probability and direction of movement independently modulate pallidal activity
The two-way ANOVA used above for the evaluation of the dynamic changes in the information coded by pallidal activity also allowed us to test for nonlinearly additive effects (two-factor interaction). This was done in the 25 neurons that were comodulated by both movement direction and reward probability (Figs. 4e, 5e). For each of these neurons, we computed the relative time in which nonlinear interactions took place out of the total time the neuron conveyed modulated information. This was conducted by dividing the number of time bins in which the interaction p values were significant by the total number of time bins in which the neuron conveyed significant information on cue position, reward probability, or both. We found only four neurons with significant interactions; in all other neurons, bins with significant interactions were not observed. In these four neurons, the relative time in which significant interactions took place out of the total time in which significant information was conveyed was 0.12 ± 0.06 (mean ± SD; range, 0.08-0.22). In the other 21 comodulated neurons and the 36 neurons that were not comodulated, interactions did not take place, and thus this value was 0. These results indicate that although a single pallidal neuron can convey information regarding two parameters, the firing rate is independently modulated by the two parameters.
Response of pallidal neurons in paired-cue trials
Pallidal neurons were also studied in the few trials (5-10% of the total trials) in which two cues were presented simultaneously (paired-cue trials). A sufficient number of trials were found in only eight GPe neurons, all of which responded to both reward probability and direction of movement. Each neuron was studied under one to three possible options of cue pairs [e.g., P(R)Left = 0.50 with P(R)Right = 1.00] and only in trials in which the monkey moved to the direction associated with the higher reward probability (the majority of trials) (Fig. 3d).
We compared the response of each neuron to a single-cue presentation with its response to the same cue presented simultaneously with another cue. Comparison was performed using ANOVA on the distribution of spike counts in parallel time-lag bins in each of these situations (Fig. 8). Most (seven of eight) neurons responded in the same way to the cue representing higher probability of future reward, whether presented alone or simultaneously with a cue representing lower reward probability. In all of these neurons, the response to the lower reward probability cue was significantly different between the single and paired trials. These significant differences were observed during cue and post-cue periods (five of seven) and during movement (six of seven). A single neuron failed to show a statistically significant difference between the two single-cue presentations and the paired-cue trials. Although caution is needed when drawing a conclusion from such a small sample, this result suggests that the activity of GPe neurons is not affected by the irrelevant cue. Rather, in the analyzed trials, as in the single-cue trials, the GPe activity was modulated by the selected direction of arm movement and the monkey's predictions of future reward probability.
Response of pallidal neurons in single-cue error trials
We examined the single-cue trials in which the monkey wrongly chose to go to the opposite side of the instructed cue [direction-error (DE) trials]. This analysis was performed only on the 12 neurons that were direction sensitive (direction and reward comodulated neurons or direction-sensitive only neurons) and were recorded for at least five error trials for one or more of the cues. Using ANOVA, we compared the activity of these neurons in DE trials with either same-cue (SC) successful trials or with same-movement (SM) successful trials. For example, we compared the DE trials of P(R)Right = 0.25, followed by left movement, with the respective SC trials [P(R)Right = 0.25; right movement] and the SM trials [P(R)Left = 0.25; left movement]. In four neurons, we found a significant difference in responses between DE and SM trials during the cue and the first half of the post-cue period (“cue-position encoding”). A single neuron had a significant difference in response between DE and SC trials during cue presentation. We also found that the responses of five of the neurons during the second half of the post-cue and movement periods differed significantly between the DE and the SC trials (“movement encoding”). Three of these five neurons, which encoded movement during the late phase of the trial, encoded the cue position in the early phase of the trials.
Although these results are based on a limited number of neurons, it seems that at the beginning of the trial the neurons encode the instructed direction; however, as the trial progresses, the actual movement direction is coded. This is in line with the possibility that throughout the trial the intended movement direction is coded. Additional studies of GPe activity may enable identification of the exact moment at which the monkey makes the decision to explore the possible gain of arm movements toward the opposite position.
Our study indicates that single neurons in the GPe may serve as an integration site for different behaviorally relevant neuronal circuits. We showed that well isolated pallidal neurons convey information regarding both direction of required movement and predicted gain from the action. Unlike dopaminergic neurons (Fiorillo et al., 2003; Morris et al., 2004), GPe neurons either increase or decrease their rate of discharge as future reward probability increases. Moreover, as trials progressed, many neurons dynamically changed their response properties from reward predicting to direction sensitive; nevertheless, neurons that encoded these two parameters simultaneously showed independent encoding of the two parameters.
Anatomically, GPe neurons are placed in a position for integrating information arriving from the two distinct BG input structures, the striatum and the subthalamic nucleus (Kita, 1992; Nambu et al., 2000). The reduction in the number of neurons between the striatum and the GPe implies convergence of striatal neurons onto single pallidal cells (Percheron and Filion, 1991; Oorschot, 1996). The large number of responding neurons in this study and their activation during more than one phase of the task is in line with the hypothesis of information convergence at the GPe level (but see Nishino et al., 1985). The complexity of the behavioral task, involving reward probability learning and memory, visual perception, spatial working memory, and movement planning and execution, is likely to activate a large number of cortical areas. Convergence of these functional circuits in the GPe is expected to launch activity in many pallidal cells.
Convergence of limbic and motor information in a major input structure to the GPe, the striatum, is still under debate. Although some studies report relatively high numbers of neurons with converging representation of limbic and motor information (Kawagoe et al., 1998), other studies report a much lower degree of convergence and a lower number of neurons that are activated in multiple phases of the task (Cromwell and Schultz, 2003). A recent study even found motor and limbic striatal neurons to form two completely separate populations (Schmitzer-Torbert and Redish, 2004). These differences in results can be attributed to many factors (e.g., behavioral paradigm, recording areas, and isolation quality) but, together with this study, may indicate more convergence at the GPe level than at the striatal level.
In the current study, we show that pallidal neurons, conveying information regarding movement-direction along with expected outcome, respond to each of these parameters independently. This independence was reflected in the lack of significant interactions between the effects of the two parameters on the discharge rate of the neurons. Movement direction and trial outcome were encoded by single neurons, either simultaneously or at different times. However, in the vast majority of cases, the modulation by the two parameters was statistically independent. The capability for independent coding may have been aided by the typical architecture of pallidal neurons, the dendrites of which extend to functionally distinct input-receiving zones (Yelnik et al., 1984). Furthermore, the spatial separation between distant dendrites attenuates electrotonic coupling (Rall, 1964) and thus promotes linear summation of functionally distinct inputs. Such a model, primarily based on anatomical studies, has previously been presented for the GPe (Percheron et al., 1984) as well as for the output nuclei of the BG (Percheron et al., 1984; Bevan et al., 1997).
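The notion of independence used here, the absence of an interaction between the two factors, can be illustrated with a difference-of-differences contrast. The sketch below reduces the design to two directions and two reward-probability levels, which is a simplifying assumption; the function name, the condition labels, and all firing rates are hypothetical. In a full analysis, the interaction term of a two-way ANOVA tests whether such a contrast departs from zero by more than trial-to-trial variability.

```python
from statistics import mean

def interaction_contrast(cells):
    """Difference of differences in a 2x2 (direction x reward probability) design.

    `cells` maps (direction, probability_level) to per-trial firing rates (Hz).
    Zero means the direction effect is the same at both probability levels,
    i.e. the two effects sum linearly.
    """
    m = {condition: mean(rates) for condition, rates in cells.items()}
    direction_effect_high = m[("right", "high")] - m[("left", "high")]
    direction_effect_low = m[("right", "low")] - m[("left", "low")]
    return direction_effect_high - direction_effect_low

# Hypothetical rates: direction adds 10 Hz and reward probability adds 6 Hz
additive_cells = {
    ("left", "low"): [58, 62],
    ("right", "low"): [69, 71],
    ("left", "high"): [65, 67],
    ("right", "high"): [75, 77],
}
# Same data, except that the direction effect doubles at high reward probability
interacting_cells = dict(additive_cells)
interacting_cells[("right", "high")] = [85, 87]

print(interaction_contrast(additive_cells))
print(interaction_contrast(interacting_cells))
```

The first data set yields a contrast of zero, the signature of linearly summed modulations; the second yields a nonzero contrast, the pattern that was absent in the vast majority of the recorded neurons.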
We do not claim that pallidal neurons are dedicated to encoding the two parameters investigated in this study (direction and outcome). It is likely that a large number of other unmonitored parameters modulate the activity of these neurons. Rather, we suggest that pallidal neurons are linear integrators modulated by a large number of different parameters. This can be attributed to two physiological characteristics of these neurons. First, intracellular in vivo (Kita and Kitai, 1991) and in vitro (Nambu and Llinas, 1994) studies of GP neurons suggest that pallidal neurons increase their firing rate almost linearly with membrane depolarization in a large range of membrane potentials. Second, pallidal neurons are unique in their high spontaneous discharge rate and therefore in their large (negative as well as positive) dynamic range (Kita and Kitai, 1991). Taken together, these results imply that pallidal neurons are capable of linearly summating a large number of different excitatory and inhibitory inputs.
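Why a high spontaneous rate favors linear summation can be shown with a toy rectified linear integrator. This is an illustrative model under stated assumptions, not a description of pallidal biophysics: the function names, the baseline rates, and the input strengths are all hypothetical, and the only nonlinearity included is that a firing rate cannot fall below zero.

```python
def firing_rate(baseline_hz, input_drives_hz):
    """Rectified linear integrator: baseline plus summed inputs, floored at 0 Hz."""
    return max(0.0, baseline_hz + sum(input_drives_hz))

def response_changes(baseline_hz, drive_a, drive_b):
    """Rate change from rest for input A alone, input B alone, and both together."""
    rest = firing_rate(baseline_hz, [])
    return (
        firing_rate(baseline_hz, [drive_a]) - rest,
        firing_rate(baseline_hz, [drive_b]) - rest,
        firing_rate(baseline_hz, [drive_a, drive_b]) - rest,
    )

# Two hypothetical inhibitory inputs of -15 Hz and -10 Hz
high_baseline = response_changes(60.0, -15.0, -10.0)
low_baseline = response_changes(5.0, -15.0, -10.0)

print(high_baseline)  # (-15.0, -10.0, -25.0): the combined change equals the sum
print(low_baseline)   # (-5.0, -5.0, -5.0): the floor at 0 Hz hides the summation
```

With the high baseline, decreases and increases in rate are both available and the combined response equals the sum of the individual responses; with the low baseline, inhibition saturates at zero and the summation is no longer visible in the output.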
The mode of information integration in the BG has been debated for years. On the one hand, a model of parallel functionally distinct cortico-BG-thalamocortical circuits (Alexander et al., 1986; Alexander and Crutcher, 1990) has found support in anatomical studies (Hoover and Strick, 1993) and in electrophysiological studies demonstrating that pallidal neuron activity is related to a single joint movement (DeLong et al., 1985; Hamada et al., 1990). On the other hand, this view contrasts with the approach that sees the BG as an integrative system (Percheron and Filion, 1991; Bevan et al., 1996). The BG funnel-shaped architecture (reduction of several orders of magnitude in the number of cells along the cortico-striato-pallidal axis) imposes some degree of convergence along this axis (Oorschot, 1996; Bar-Gad et al., 2003). Furthermore, anatomical studies have suggested information convergence from distinct cortical areas (for review, see Bolam et al., 2000). The findings reported here support the view of functional convergence at the GPe level. This further suggests that the GPe is a crucial station of information processing in the BG and thus is a major participant in stimulus-action-outcome association.
The GPe is part of the cortico-BG-thalamocortical loop. Many locations of convergence may be found along this loop, as reflected by electrophysiological recordings. Outcome-modulated activity during movement has been documented in the input structures of the BG (Kawagoe et al., 1998; Hassani et al., 2001), the output structures (Gdowski et al., 2001; Sato and Hikosaka, 2002), as well as in cortical areas (Watanabe, 1996; Shima and Tanji, 1998; Leon and Shadlen, 1999; Platt and Glimcher, 1999; Wallis and Miller, 2003; Sugrue et al., 2004). In light of its anatomical and physiological properties, the independence of motor and limbic information flow through the GPe suggests that this nucleus is a primary site of information convergence in this circuit.
This work was supported in part by a Center of Excellence grant administered by the Israel Science Foundation and by the German-Israel Binational Foundation and the German-Israeli cooperation in the neurosciences administered by the Israel Ministry of Science and Technology and the Federal Ministry of Education and Research. D.A. was supported by the Faye Kaufman Prize, and G.M. was supported by a Horowitz fellowship. We thank J. O. Dostrovsky, Y. Ritov, and N. Parush for fruitful discussions and comments on previous versions of this manuscript, G. Goelman for MRI, and V. Sharkansky for technical assistance.
Correspondence should be addressed to Dr. David Arkadir, Department of Physiology, Hadassah Medical School, The Hebrew University, P.O. Box 12272, Jerusalem 91120, Israel. E-mail:.
Copyright © 2004 Society for Neuroscience 0270-6474/04/2410047-10$15.00/0