Abstract
Few studies have investigated how the cortex encodes the preshaping of the hand as an object is grasped, an ethological movement referred to as prehension. We developed an encoding model of hand kinematics to test whether primary motor cortex (MI) neurons encode temporally extensive combinations of joint motions that characterize a prehensile movement. Two female rhesus macaque monkeys were trained to grasp 4 different objects presented by a robot while their arm was held in place by a thermoplastic brace. We used multielectrode arrays to record MI neurons and an infrared camera motion tracking system to record the 3-D positions of 14 markers placed on the monkeys' wrist and digits. A generalized linear model framework was used to predict the firing rate of each neuron in a 4 ms time interval, based on its own spiking history and the spatiotemporal kinematics of the joint angles of the hand. Our results show that the variability of the firing rate of MI neurons is better described by temporally extensive combinations of finger and wrist joint angle kinematics rather than any individual joint motion or any combination of static kinematic parameters at their optimal lag. Moreover, a higher percentage of neurons encoded joint angular velocities than joint angular positions. These results suggest that neurons encode the covarying trajectories of the hand's joints during a prehensile movement.
Introduction
Whereas a motor act like playing the piano often requires independent control of many of the 20 degrees of freedom of the hand, prehension—preshaping the hand to grasp an object—involves a small set of synergistic movements of the segments of the hand. As such, studies often characterize prehension in a lower dimensional space either by considering grip aperture alone or by reducing the dimensionality using a small set of principal components on the joint angle kinematics (Paulignan et al., 1991; Mason et al., 2002, 2004; Santello et al., 2002; Todorov and Ghahramani, 2004; Hendrix et al., 2009).
Though extensive psychophysics studies suggest that the CNS controls prehension with a limited set of kinematic synergies, few studies in cortex investigate how neurons might encode these synergies. Instead, previous motor cortical encoding studies have shown that the discharge rates of MI neurons, in large time windows (∼150 ms), vary with discrete states such as whether a finger joint is flexed or extended, grasp postures, or object categories (Schieber and Hibbard, 1993; Georgopoulos et al., 1999; Poliakov and Schieber, 1999; Umilta et al., 2007; Aggarwal et al., 2009). Recently, one study showed that single MI neurons vary their discharge rates in time with respect to grip aperture and force and another showed that populations of neurons simultaneously decode the joint kinematics of the arm and hand during reach-to-grasp movements (Hendrix et al., 2009; Vargas-Irwin et al., 2010).
Here, we extend these previous studies by describing how single MI neurons encode the temporal evolution of a set of joint kinematic features of the hand during prehension. In particular, we demonstrate that single MI neurons encode a synergistic set of temporally extensive joint trajectories of the fingers and wrist. We used a generalized linear model framework to predict each neuron's response based on the finger and wrist kinematics as well the spiking history of the neuron (Truccolo et al., 2005, 2008; Hatsopoulos et al., 2007). These spike history terms may capture intrinsic temporal dependencies in the neuron's response (Pillow et al., 2008; Stevenson et al., 2009). We show that both kinematic and spike history terms contributed significantly to the predictive power of the majority of neurons. The model's predictive power significantly increased when we replaced the kinematic parameters at their optimal lag by temporally extensive kinematic trajectories. A larger percentage of neurons encoded joint velocities compared with angular positions and more neurons encoded flexion/extension movements about the wrist and the metacarpophalangeal joint of the index finger compared with other kinematic trajectories. In 51% of neurons, 10 of 12 trajectories were required to reach the maximum predictive power of the kinematics which indicates that neurons tend to encode more than one joint angle parameter. We also tested the contribution of each principal component where each component represented a linear combination of the kinematic trajectories of the joint angles: in a majority of neurons, one or two principal components accounted for all the predictive power of the kinematics which suggests that neurons encode a low dimensional representation of the joint trajectories that characterize prehension.
Materials and Methods
Behavioral task
Two subjects (female rhesus macaques, monkey O 6.6 kg, monkey A 6.5 kg) were trained to grasp objects with their right hand. The arm was constrained with a thermoplastic brace, at 170° extension in the elbow joint, and 80° elevation (abduction) in the shoulder joint. A grasp trial consisted of 4 different periods: a premovement period, a movement period preceding the grasp, a hold period when the subject held the object, and a release period when the robot retracted the object (Fig. 1A). At the start of a trial, the robot moved the object toward the subject's hand. The trajectory and speed of the robot were designed to approximately simulate actual hand transport. Trajectories were fine tuned so that the object approached the subject's hand in the same orientation that the subject's hand approached the object in a free reaching condition. The subject preshaped her hand to grasp the approaching object. The subject was instructed to hold the object until the robot retracted it. Three objects were presented, requiring a total of 4 hand configurations (Fig. 1B). The first configuration was a key grip, where the monkey held a thin flat object between the thumb and the side of the middle phalanx of the index finger; the second object, the D-ring, was presented twice, in configurations that elicited a whole hand grip (power) at two different wrist orientations. These two different configurations will be treated as separate objects—D-ring horizontal and D-ring vertical—in the rest of this paper. The third object, a sphere, elicited a whole hand grip with a fanning of the fingers. These objects were designed to evoke a variety of coordination patterns among the joint angle kinematic trajectories. Each object was presented to the subject in a block of at least 30 trials.
Neural data acquisition
Each monkey was chronically implanted with a 100 electrode (400 μm interelectrode separation, 1.5 mm electrode length; BlackRock Microsystems) microelectrode array in the arm-hand area of the primary motor cortex in the left hemisphere, contralateral to the hand they used to grasp the objects. The array was placed parallel and as close as possible to the central sulcus, medial from the spur of the arcuate sulcus (Fig. 1C). Surgical details have been described previously (Rousche and Normann, 1992; Maynard et al., 1997, 1999). During each recording session, signals were amplified (gain 5000×), bandpass filtered between 0.3 and 7.5 kHz, and digitized at 30 kHz using the Cerebus Neural Data Acquisition system (BlackRock Microsystems). For each channel, a threshold was set above the noise band: if the signal crossed the threshold, a 1.6 ms window of the signal—as to yield 48 samples given a sampling frequency of 30 kHz—was sampled around the occurrence of the threshold crossing. To classify the waveforms as noise or units on the given channel, the waveforms were spike-sorted offline using Offline Sorter (Plexon) and/or exported to Matlab (MathWorks) and spike-sorted using custom software.
Motion tracking
The three-dimensional (3-D) positions of monkeys' wrists, hands, and fingers were recorded using a video-based motion analysis system (Vicon Motion Tracking System, Workstation 460). We used six M2 cameras at 1.2 megapixel resolution. Fourteen spherical retro-reflective markers (3 mm diameter) were glued to the monkeys' fingers, hand and forearm (Fig. 1A). Their 3-D positions were used to calculate the angles of the following joints: wrist flexion/extension (WR), the metacarpophalangeal joint of the index finger (IMP), the proximal interphalangeal joint of the index finger (IPIP) and ring finger (RPIP), the angle between the proximal interphalangeal joints of the index and ring fingers with the metacarpophalangeal joint of the middle finger as the origin (FAN), and, last, the angle between the metacarpals of the thumb and index finger with the vertex at the base of the radius (TI). The angle TI is projected on the plane that is formed by the metacarpal bone of the index finger and the metacarpal bone of the thumb. A custom software package (TheGame, version 2) was used to remotely control the trial capture and align the kinematics data with the neural data via a synchronizing pulse that was sent to both systems every 150 ms. The kinematics data were sampled at 250 Hz and bidirectionally filtered with a fourth order Butterworth lowpass filter with 6 Hz cutoff. All data filtering and calculations were done in MATLAB. The start of movement and the time of grasp were determined by analyzing the kinematics—the start of movement corresponded to the time at which the first marker's speed profile exceeded 10 cm/s. To determine the grasp time, we calculated the time at which the last marker's speed profile dropped below 10 cm/s.
Computational model
We developed an encoding model to predict the spiking response of each neuron based on a set of external and internal covariates. The external covariates consisted of the kinematic trajectories of a group of finger and wrist joints. The internal covariates included the history of spiking responses of the neuron on multiple time scales. These covariates may capture effects such as intrinsic temporal correlations due to refractoriness, bursting, rhythmicity, and short-term plasticity. We used a generalized linear modeling framework where the conditional intensity function λt, the neuron's instantaneous spike count in a 4 ms time bin, is estimated as a function of a set of covariates—the joint kinematics and the neuron's firing history. A generalized linear model has the advantage over standard multiple linear regression in that it is designed for discrete, integer responses (i.e., numbers of spikes emitted in a time bin) that are not normally distributed as assumed by standard regression. We linked the rate function to the covariates through a log function, to ensure non-negative estimated rates. That is, the logarithm of the firing rate of the neuron is a linear function of the covariates. More details about the application of the generalized linear model to neural data can be found previously (Paninski et al., 2004a; Truccolo et al., 2005).
Spike history.
Setting aside the effects of external covariates, the current spiking response of a neuron depends on the neuron's spiking response in the past. Such spike-history effects may include short time-scale phenomena such as refractoriness and bursting and longer time-scale oscillations, facilitation and depression phenomena occurring on the order of tens to hundreds of milliseconds (Hille, 2001; Keat et al., 2001; Chen and Fetz, 2005). To model these spiking history components, we filtered the spike train with a basis set of raised cosines of the form: for t, such that a log(t + c) ∈ [ϕj − π, ϕj + π], and 0 previously (Pillow et al., 2008). The ϕj are the x-axis positions of the peaks of the cosine curves. Their positions were selected as equidistant on the regular x-axis scale—that is, before applying the log transform, the basis centers were equidistant from each other. After the log transform, the basis centers were at 8 ms, 12 ms, 20 ms, 32 ms, 60 ms, 108 ms and 208 ms respectively. The resulting basis set bj, where j ∈ [1,7] accounts for both fine temporal correlations near the spike and broader correlations at longer time lags from the spike. Each basis vector was convolved with the spikes ρ(t) to give rise to J = 7 spike history vectors Hj, representing the effect of spike history up to 200 ms in the past: where T is the duration of the basis vector.
The following spike history model is used as a control model excluding all external covariates: The βjH terms represent the weights associated with the history terms and β0 can be taken to be the baseline firing rate of the neuron.
Kinematic variables.
To model velocity and position together with spike history, we used the following model: where r represents the time lags between the kinematics and the spiking, ranging from −164 ms to 200 ms after the spiking activity, with a step size of 52 ms, f denotes a particular kinematic feature, and F is the total number of angular or angular velocity features (F = 6; θ is the joint angle and θ̇ is the joint angle velocity). Each term associated with a coefficient β is a covariate. The total number of kinematic terms is, therefore, 96 = (6 angles + 6 angular velocities) × 8 time lags. In the remainder of the text, we use “trajectories” to denote joint angles or angular velocities of the fingers and wrist at multiple time lags relative to the spikes. The spike history terms and kinematic terms on the right hand side of the equation are also referred to as covariates of the conditional intensity function.
Assessing the significance of the kinematic terms at each lag.
For each neuron, we assessed the significance of the kinematic terms at each lag by using a goodness-of-fit test: the χ2 test on the deviances between two models. The first model was the full model in Equation 4. The second model was the same as the full model but with the tested term removed. As such, the second model's terms were a subset of the terms in the full model. When testing two models within the generalized linear model framework, one of which has a subset of the terms of the other model—a model that is “nested” in the first, the difference in the deviances between the two models follows a χ2 distribution (McCullagh and Nelder, 1989). An increase in deviance from the full model to the nested model would indicate an increase in fitting error due to the exclusion of the tested term. We used the χ2 test to assess whether the change in deviance was significant at a p-value smaller than 0.05.
Principal components.
We took advantage of the fact that the kinematic terms are correlated during grasping and computed the principal components of all the 96 kinematic terms to reduce the number of parameters in the model. The kinematic data were sampled in 4 ms bins, where one observation corresponded to the trajectories of all the kinematic parameters (96 terms). The next observation corresponded to the same parameters sampled 4 ms after the samples from the previous observation such that total number of observations corresponded to the number of trials multiplied by the number of 4 ms samples per trial (Hatsopoulos et al., 2007). We computed the principal components of the 96 kinematic variables for these observations such that the last two terms from the right hand side of Equation 4 were replaced by the number of principal components that accounted for 90% of the variance. The number of principal components ranged between 10 and 15 principal components (P in Eq. 5) depending on the dataset. The covariates PC in Equation 5 refer to the first P principal components; that is, the projections of the kinematic data onto the first P eigenvectors. We refer to the “full model” as a model that included all the terms in the right hand side of Equation 5.
Prediction accuracy.
To test for the prediction accuracy of a model, we computed the area under the receiver operating characteristic (ROC) curve (AUC) for all test data, using 10 folds of cross-validation (i.e., 10 distinct sets of test data that were not used to build the model). The receiver operating characteristic curve is a plot of the true-positive versus false-positive rates of prediction. To calculate true-positive and false-positive rates at different thresholds, we started by aligning the real spike counts with the predicted spike counts. We then looped through a set of thresholds that matched the range of conditional intensity estimates (the predicted data). The conditional intensity estimate represents the firing rate estimation in a 4 ms bin and typically ranged from 0 to ∼0.2. A value of 0.2 would correspond to a firing rate of 50 Hz. We looped through 50 equidistant thresholds between 0 and the maximum value of the conditional intensity estimate. For each threshold, we calculate the total number of positives, namely the number of timestamps for which the estimated conditional intensity was above the threshold. Of these, the hits or true positives correspond to actual spikes, and the rest are false positives. Each point in Figure 3D (see below) corresponds to one of these calculations; that is, one false-positive and true-positive reading at a given threshold for one cross-validation fold. For each fold, we computed the AUC. Numerically, this means we connected the dots associated with one fold in Figure 3D and computed the piecewise trapezoidal area between each set of two consecutive dots. These areas were summed to yield the total area under the ROC curve for each fold. The AUC provides an assessment of the predictive power of the encoding model and ranges between 0.5 and 1.0, where 0.5 represents chance level prediction and 1.0 is a perfect prediction (see Fig. 3D). Specifically, when two different samples are randomly drawn from the data (one sample contains a spike and the other does not), the AUC represents the probability that the model will yield a higher probability for the sample with a spike. More details about ROC curve analysis can be found in other articles (Fawcett, 2006; Hatsopoulos et al., 2007; Truccolo et al., 2008, 2010). We used a Wilcoxon signed rank test to compare the AUCs of neurons for different tested models. If the test involved multiple comparisons, we applied a Bonferroni correction to the p-values.
Results
We collected two datasets that were 10 days apart per subject, where each dataset included at least 30 grasps to each of the 4 objects. In the first monkey, we found 23 of 41 and 22 of 39 task-related MI neurons for the first and second experiments respectively; in the second monkey, we found 20 of 35 and 13 of 17 task-related MI neurons. A neuron was considered to be task-related if its mean firing rate during the movement period was significantly different from its mean firing rate during the premovement period (paired t test across trials, p < 0.05). In the remainder of the text, when referring to neurons, we mean cell samples that were task-related. We have previously shown that for a given subject implanted with a Utah array, 43% of units are stable across 10 d on average (Dickey et al., 2009). Taking this estimation into consideration, we expect that in the same animal, ∼9 task-related cell samples overlapped from one experiment to the next. Firing rates varied across a trial, across objects and across neurons (Fig. 2A–D). The starting postures, the end postures, and the kinematic trajectories varied across object task conditions (Fig. 2E). IMP (green), IPIP (blue) and RPIP (black) show patterns that reflect an extension and flexion of the proximal and middle phalanges of the fingers. Flexion of the fingers is least pronounced for the sphere object indicating that the end posture had the hand in a semi-open configuration. TI (red) is the angle between the metacarpal bones of the thumb and index finger with the origin at the base of the radius and represents the abduction/adduction of the thumb. The thumb was abducted most prominently when the monkey grasped the vertical D-ring. This subject did not tend to oppose her thumb to the rest of the fingers when grasping the sphere, horizontal D-ring, or key-grip object. Instead, when grasping these objects, she tended to keep her thumb adjacent to the metacarpal bone of the index finger. FAN (orange) is an analog for the abduction/adduction of the proximal phalanges. The monkey fanned her fingers to grasp both the sphere and vertical D-ring but not the other two objects. WR (pink) represents wrist flexion and extension. The robot brought the objects to the monkey's hand in a manner that prevented the need for an extension of the wrist before the grasp. This was a necessary precaution to prevent any injuries of their hands in the event that they would not grasp the object.
Prediction results
The encoding model accurately predicted the precise timing and firing rates of the cells at a 4 ms resolution. The spiking responses as well as the predicted responses from the model for an example neuron are shown for each of the four objects (Fig. 3A–C). The predictive power of the model for this neuron—the area under the ROC curve (AUC)—yielded a value of 0.79. On a population level, the median predictive power of the model for all neurons was 0.7608 which was significantly above chance (Fig. 3E, Wilcoxon signed rank test—rankWil—to assess whether the AUCs of all neurons come from a distribution whose median is 0.5, p < 0.001).
Spike history
We used 7 basis functions to model the effect of a neuron's own history at different time scales (Fig. 4A,B) (see Materials and Methods). We tested whether particular spike history terms were encoded by each neuron. To test the significance of each history term for a neuron, we compared the full model against one with the test history term removed. A significant decrease in AUC indicated that the given history term was significant for that neuron. Figure 4C displays the results of this analysis for a given neuron: for this neuron, removing the first and third spike history terms (Fig. 4A,B; B1 in green and B3 in cyan) yielded a significant decrease in predictive power (paired rankWil, p < 0.05). We found that between 29% and 68% of neurons encoded each of the seven history terms (Fig. 4D, dark blue). We also examined whether history effects differed when no movement was occurring. We recomputed the encoding model using data that were collected during the premovement period. The proportion of neurons with significant history effects on a short time-scale (i.e., first through third history terms) did not differ substantially during the premovement period. In contrast, there was at least a 45% drop in the number of neurons that encoded the fourth, sixth and seventh spike history terms (Fig. 4D, light blue) though each history term was encoded by at least 21% of the neurons.
Kinematics
We tested whether neurons encode temporally extensive, kinematic trajectories compared with kinematic parameters measured at a single time lag with respect to the spike times. To do this, we first built an encoding model using single time lags to determine the optimal lag time for each neuron assessed with the goodness-of-fit test (see Materials and Methods). Figure 5, A and B, displays the distribution of optimal lags for all neurons, calculated for the joint angle velocities and joint angle positions respectively. A plurality of neurons yielded the lowest p-value (see Materials and Methods, χ2 test on deviance) at a positive lag of 148 ms or 200 ms. A positive lag means the kinematics lag the neural activity whereas a negative lag indicates the kinematics lead the neural activity. We then built a model by cumulatively adding neighboring lags, starting at a lag of 200 ms, up to a negative lag of −164 ms yielding a largest trajectory length of 364 ms. Figure 5C displays the mean difference in predictive power between the tested trajectory length and the trajectory length at which the maximum predictive power was achieved. The trajectory lengths at which the lowest absolute values on this graph are shown represent the lengths at which the highest predictive power is achieved. The error asymptotically converged to 0 at a path length between 312 ms and 364 ms. Since the errors at trajectory lengths of 312 ms and 364 ms were significantly different (paired rankWil, p = 0.0092), we concluded that a trajectory length of 364 ms was optimal. Next, we compared the predictive power of an encoding model with the 12 kinematic terms sampled at a trajectory length of 364 ms to one with the same terms sampled only at their optimal lag (Fig. 5D). In 90% of the neurons, the AUC was significantly higher for the full model compared with one with spike history terms and the kinematics at just their optimal lags (paired rankWil, p < 0.05). The difference in AUC between these two models represents the added predictive power when incorporating the 364 ms trajectory. At a population level, the AUC was higher for the full model compared with a model with the kinematics covariates only at their optimal lags (paired rankWil, p < 0.001).
We also tested whether particular kinematic trajectories were encoded by each neuron. Each kinematic trajectory will be referred to as a feature. To test the significance of each feature for a neuron, we compared the full model against one with the test feature removed. A significant decrease in predictive power indicated that the given feature was significant for that neuron. As an example for a single neuron, Figure 6A displays the mean AUC for the full model in dark gray and the mean AUC for a model with one of the 12 features removed at a time in blue. The feature that was removed from the model is indicated on the x-axis. For this given neuron, removing either of WRv, IMPv, IPIPv, TI, TIv, or WR yielded a significant decrease in predictive power (paired rankWil, pWRv = 0.002, pIMPv = 0.002, pIPIPv = 0.003, pTI = 0.037, pTIv = 0.041, and pWR = 0.049). All features were represented though IMPv and WRv were more frequently represented than other features and joint angle velocities as a whole were also 3 times more frequently represented than joint angle positions (Fig. 6B). Across the entire population of neurons, a median of 3 features were significant per neuron (paired rankWil, p < 0.05 in all cases; Fig. 6C).
We then tested how many features were required to attain the full model's predictive power for each of the neurons. To do so, we cumulatively added features to the model until its predictive power was not significantly different from the predictive power of the full model. Figure 6D displays the increase in predictive power with the addition of each feature for a given neuron, when the features were added in their order of significance For the example cell, the full model's predictive power was achieved when 10 features were added, even though only 5 features were significant (paired rankWil, p = 0.047). Across cells, a median of 10 features were required to attain the full model's predictive power (Fig. 6E). These results suggest that a majority of features are necessary to account for the response variability of the neurons.
Given that our model requires combinations of multiple kinematic features to most accurately predict neuronal response, we sought to test whether neurons encode synergistic, coordination patterns of hand motion that occur during grasp. The principal components (PC) of all the kinematics features represent linear combinations of the joint angle kinematics across a 364 ms time window and can be interpreted as coordination patterns among the 12 kinematic feature trajectories. The first two eigenvectors of the principal components were highly consistent across datasets and across monkeys (Fig. 7A). Looking at the joint angle position terms, PC1 models a simultaneous flexion of the wrist and extension of the fingers, followed by flexion of the fingers. The IPIP feature for the second experiment in the first subject yielded a slightly larger magnitude in the movement trajectory. This is likely due to a different strategy used on that day. The IMP feature also shows magnitude differences in the movement trajectories across subjects and this is also likely due to differences in the strategies used by the monkeys. TI and FAN do not show as much modulation as the other features and are not as consistent across experiments; however, for three of the four experiments, PC1 shows an abduction of the fingers followed by an adduction. PC2 shows an extension followed by flexion of the wrist with a simultaneous flexion and extension of the fingers. In 3 of 4 experiments, it also shows a later abduction followed by a slight adduction of the fingers. We applied the same procedure to the PCs as we did to the features: we systematically removed one PC at a time and determined whether there was a significant decrease in the model's predictive power (Fig. 7B). In the given example cell, a significant decrease in predictive power could only be achieved by removing the second principal component (Fig. 7B, paired rankWil, p < 0.001). On a population level, 36% of neurons encoded the first and second principal components (Fig. 7C). To assess how many components were required to achieve the full predictive power of the cell, we cumulatively added the components to the encoding model, in their order of significance. When the components were added in this order, 81% of cells required between 1 and 3 components to attain the full model's predictive power of the model (Fig. 7E).
Spike history versus kinematics
The actual spike trains and the predicted firing rate using the full model for an example neuron are shown in Figure 8B. To examine the relative contribution of the kinematics and history terms of the model, we built and tested an encoding model that included just the kinematics terms (Fig. 8C; data from all 4 objects are shown in sequential blocks) and one that included just the spike history terms (Fig. 8D; data from all 4 objects are shown in sequential blocks). We observed that the kinematics model captured the coarse temporal structure of the spike train while the spike history model captured the fine and coarse temporal structure of the spike train but at a temporal lag from the actual spiking activity (Fig. 8E). In fact, the maximum correlation between the actual spikes and the predicted firing rates of the kinematic model occurred at an offset of −4 ms (median of offsets across neurons). The distribution of offsets between the actual spikes and the predicted firing rates of the spike history model was bimodal, with a peak at 0 ms and one at −28 ms. Many more neurons yielded an offset around −28 ms such that the overall median of the distribution was at −28 ms. These results indicate that the kinematics yield more precise onset timing information than the spike history terms. In terms of predictive power, both the kinematics and spike history models performed significantly better than chance on a population level (rankWil, pkins<0.001; phist<0.001). The kinematics and the history terms contributed to varying degrees to the predictive power of cells: for 38% of neurons, the predictive power of the kinematic model was significantly higher than the predictive power of the history model and for another 40% of neurons, the predictive power of the spike history model was significantly higher than the predictive power of the kinematic model (paired rankWil s <0.05). However, as a population, there was no significant difference between the kinematics and history models (Fig. 8F, rankWil, p = 0.0319). In contrast, for 84% of neurons, the full model, which included both history and kinematic terms, performed significantly better than the kinematics model (Fig. 8G; paired rankWil, p < 0.05 in all cases). And the full model performed better than the history models alone in 96% of the neurons (Fig. 8H; paired rankWil, p < 0.05 in all cases).
Generalization
A stronger test for the validity of an encoding model is to test its predictive power in a different context. We built the encoding model with data from 3 of the 4 object conditions and tested it on the data from the fourth object condition. We call these the “restricted” models. We repeated this procedure 4 times, to test whether the kinematics and spike history from any 3 object conditions are sufficient to predict the firing rate of a neuron recorded while a fourth object was grasped. The actual spike trains and predicted firing rates for a given neuron in each test condition are displayed in Figure 9, A and B. To assess how well the model generalizes in each test condition, we compare the difference in predictive power of the cells between the 4-object model and each of the restricted models (Fig. 9C,D). Though the mean ranks of the AUCs between the 4-object model and any of the restricted models were significantly different (Kruskal Wallis ANOVA test followed by paired rankWil, p < 0.05 in all cases), the predictive power of the neurons were well above chance (rankWil against a median of 0.5, p < 0.001 in all cases) indicating the restricted models did generalize to a large extent.
Discussion
We found that MI neurons in the hand/arm area encode joint trajectories that describe prehensile movements. Though a large percentage of cells encoded the angular velocities of the flexion/extension of the wrist and proximal phalanx of the index finger, at least 10 of the features were required to account for all the predictive power of the kinematics in at least half the tested population. By considering the principal components on the movement trajectories as features, we were able to account for all the predictive power of the majority of neurons with just up to three features. This indicates that neurons preferentially encode coordinated movements that describe a grasping movement, rather than a movement about any one joint.
Task-related neurons
Intracortical microstimulation studies suggest that MI has a horseshoe-like somatotopic organization, where the center is located in the anterior bank of the central sulcus and evokes movements of the hand and fingers while the periphery evokes movements of the arm (Park et al., 2001). We placed the array in the intermediate zone, an area between the center and peripheral zones that evokes hand and arm movements (Fig. 1C). It is possible that the 39% of neurons that did not modulate with the grasp task were reach-related. In fact, we recorded a reach-to-grasp task in the same session and found that more neurons modulated with the task (results not reported), presumably due to arm movements. This is consistent with studies that show that neurons in this area modulate their firing rate with arm movements (Georgopoulos et al., 1986; Moran and Schwartz, 1999b; Stark et al., 2007; Truccolo et al., 2008).
Grasping kinematics and relationship to unconstrained reach-to-grasps
Our experimental task aimed to isolate neural responses that are tied to grasping movements. Though the arm was constrained in the brace, the kinematic trajectories of the hand closely match those reported in studies that examined unconstrained reach-to-grasp movements (Fig. 2E) (Mason et al., 2001; Santello et al., 2002). As the kinematic trajectories in Figure 2E depict, a prehensile movement begins with a simultaneous extension and abduction of the fingers including the thumb. The degree to which the hand opens is related to the size of the object (Paulignan et al., 1991). As the hand approaches the object, all the fingers flex and adduct to grasp the object.
Spike history terms
Neurons in MI exhibit phasic and tonic firing rates that reflect fast and slow changes in their biophysical properties (Cheney et al., 1985; Kalaska et al., 1989). The spike history terms were used to model temporal correlations such as these. When we compared the predictive power of models that were built with data from the premovement period only, we observed that the number of cells that encoded the long time-scale effects dropped by at least 45%. Two possible interpretations are that these low-frequency spike history terms capture some aspect of the kinematics that is not modeled by the kinematic parameters or that they capture slow changes in the biophysical properties of cells that only occur during movement.
Time lag analysis
Assuming that neurons encode kinematics at a single time lag, the most significant lag time across neurons was found to be at 148 ms or 200 ms for all kinematic features, consistent with the lag times observed in reaching studies (Moran and Schwartz, 1999a; Paninski et al., 2004b). This study shows that that the predictive power of neurons was maximal for trajectory lengths of 364 ms, compared with shorter lengths including the kinematics at just their significant lag. Although the predictive power of the model increased significantly for a temporally extensive trajectory, the improvement was somewhat modest. Nevertheless, there was a systematic improvement in predictive power as the trajectory length was increased suggesting that grasp trajectory information is contained in the responses of MI neurons (Fig. 5C). Previous studies of reaching show that neurons shift their directional tuning in time during a movement and, as such, suggest that neurons encode trajectories instead of static kinematic parameters at their optimal lag (Hocherman and Wise, 1991; Hatsopoulos et al., 2007).
Single versus multiple features
Previous cortical encoding studies suggested that single neurons encode more than one individuated finger movement (Schieber, 1990; Schieber and Hibbard, 1993). If we define a finger movement as an angular position and velocity trajectory of a joint, we found similar results: the predictive power of the kinematics was only fully described by 10 kinematic trajectories in a majority of neurons. Despite this multifeature encoding, neurons encoded joint angular velocities more frequently than joint angular positions and preferentially encoded the velocity trajectories of the wrist and the metacarpophalangeal joint of the index finger compared with other trajectories.
Psychophysics studies suggest that the CNS controls prehension by means of two synergies, represented by the first two principal components on the joint angle kinematics of the fingers and wrist (Mason et al., 2001; Santello et al., 2002). Santello et al. (2002) describe the first principal component as an extension of the metacarpophalangeal and proximal interphalangeal joints to a maximum excursion, followed by flexion of the same joints. At the same time, the digits, are gradually abducted and later adducted toward the end of the reach while the thumb is only abducted. In this paper, we demonstrate that the first two principal components (PCs) were more frequently encoded by the neurons (Fig. 7C) and that a small set of principal components (between 1 and 3) account for the response variability of most MI neurons. Together, these results suggest that each neuron encodes a particular spatiotemporal trajectory that involved the coordinated kinematics of all the joint angles of the hand.
Spike history versus kinematics
This study extends previous encoding models of the hand by considering the continuous hand kinematics as well as the spiking history to predict the firing activity of a neuron on a fine time scale. We found that the spike history terms account for the fast and slow changes in firing rate whereas the kinematics only account for the slower changes. This is understandable given the relatively slow fluctuations of the kinematics. Since the overall modulation onset of an MI neuron is intimately linked to movement onset of the hand, it follows that the kinematics tended to more accurately model the onset of firing rate modulation. The spike history predicted the fine and coarse temporal structure of the firing activity but predicted modulation onset with a time lag of 28 ms. Spike history cannot predict modulation onset accurately because it carries only past spiking information. The degree to which the fine temporal structure of the neuron could be modeled is limited by the cutoff frequencies of the spike history bases, which are a form of low-pass filter. The first spike history term could capture spike frequencies as high as 60 Hz. Any faster changes in spike frequency could not be captured by the spike history terms since the other bases capture modulations at lower frequencies. For the majority of neurons, a full model yielded significantly higher predictive power than a model with either spike history or kinematics terms alone. This result indicates that spike history and kinematic terms hold non-redundant information about the firing rate of a neuron and can be used together to predict the firing rate of neurons at a fine time scale.
Generalization
We tested whether building an encoding model from the data from three objects could generalize to predict the firing rate collected when the monkey grasped a fourth object. The predictions from these restricted models were well above chance. Compared with a model that was built on data from all four objects, all four restricted models yielded significantly lower prediction accuracy (Fig. 9C,D). This slight decrease in performance is likely related to the fact that each object elicited some unique aspect of movement. This study was limited in the number of objects that were used in the analysis to get repeated trials of the same object condition. However, although we used a small number of objects, at least the first principal component closely matched the one reported in a human psychophysics study where 20 objects were tested.
Conclusion
MI, as a major output pathway to the spinal cord, is thought to elaborate the motor plan into a more detailed movement. Few encoding studies have characterized what parameters of the grasping movement are particularly relevant for MI neurons: one study recently showed that neurons in MI respond to grasp dimension and grasp force (Hendrix et al., 2009). Our model focuses on an encoding model of single neurons and accounts for the temporal progression of the grasp movement during preshaping before grasp, an important aspect of prehension (Mason et al., 2001; Santello et al., 2002).
We found that neurons preferentially encode different subsets of principal components, which model the coordinated movements that elicit a prehensile movement. This result suggests that the primary motor cortex encodes parameters that reflect ethologically relevant movements, rather than abstract, Newtonian parameters such as joint angular velocity. Though it is generally assumed that simple variables will predict responses to more complex ones, studies in different model systems have shown that using a variable that reflects an ethologically relevant behavior or natural sensory input, greatly sharpens the tuning properties of neurons (Margoliash, 1983; Margoliash and Fortune, 1992; David et al., 2004). With the advent of telemetric neural recording devices and marker-less motion tracking systems, it will be possible to further investigate primate behavior in more ethologically relevant contexts (Obeid et al., 2004).
Footnotes
-
This work was supported by National Institute of Neurological Disorders and Stroke Grant R01 NS045853 and by a gift from Ron and Steven Tarrson. We thank Lee Miller, Peter McCullagh, Konrad Körding, and Ian Stevenson for their thoughtful comments on our manuscript and analysis techniques. We also thank Dan Morris and John Donoghue for sharing their experiment control software with us. We also thank D. Paulsen, J. Mattiello, and A. Zook for their contributions.
- Correspondence should be addressed to Nicholas G. Hatsopoulos, Department of Organismal Biology and Anatomy and Committee on Computational Neuroscience, University of Chicago, 1027 E 57th Street, Chicago, IL 60637. nicho{at}uchicago.edu