Abstract
The ability of humans to reach and grasp objects in their environment has been the mainstay paradigm for characterizing the neural circuitry driving object-centric actions. Although much is known about hand shaping, a persistent question is how the brain orchestrates and integrates the grasp with lift forces of the fingers in a coordinated manner. The objective of the current study was to investigate how the brain represents grasp configuration and lift force during a dexterous object-centric action in a large sample of male and female human subjects. BOLD activity was measured as subjects used a precision-grasp to lift an object with a center of mass (CoM) on the left or right with the goal of minimizing tilting the object. The extent to which grasp configuration and lift force varied between left and right CoM conditions was manipulated by grasping the object collinearly (requiring a non-collinear force distribution) or non-collinearly (requiring more symmetrical forces). Bayesian variational representational similarity analyses on fMRI data assessed the evidence that a set of cortical and cerebellar regions were sensitive to grasp configuration or lift force differences between CoM conditions at differing time points during a grasp to lift action. In doing so, we reveal strong evidence that grasping and lift force are not represented by spatially separate functionally specialized regions, but by the same regions at differing time points. The coordinated grasp to lift effort is shown to be under dorsolateral (PMv and AIP) more than dorsomedial control, and under SPL7, somatosensory PSC, ventral LOC and cerebellar control.
SIGNIFICANCE STATEMENT Clumsy disasters such as spilling, dropping, and crushing during our daily interactions with objects are a rarity rather than the norm. These disasters are avoided in part as a result of our orchestrated anticipatory efforts to integrate and coordinate grasping and lifting during object interactions, all before the lift of an object even commences. How the brain orchestrates this integration process has been largely neglected by historical approaches that focus solely and independently on reaching and grasping and the neural principles that guide them. Here, we test the extent to which grasping and lifting are represented in a spatially or temporally distinct manner and identify strong evidence for the consecutive emergence of sensitivity to grasping, then lifting, within the same region.
- dexterous object manipulation
- force control
- grasping
- lifting
- neural representations
- representational similarity analyses
Introduction
The efficiency and reliability of our interactions with objects of various properties in an inherently dynamic environment is an unequivocal testament to the deftness of human action. Clumsy disasters like spilling, crushing, or dropping are a rarity rather than the norm due to anticipatory force scaling at the end of a reach-to-grasp, before the start of object lifting (Schneider and Hermsdörfer, 2016). Key to this anticipatory process is the integration of grasping and lifting, evident not only when scaling grip to lift forces during the load phase of object-centric actions (Johansson et al., 1992a,b), but also when lift force critically depends on the grasp configuration (i.e., digit location) at object contact, for example in object-centric actions requiring torque or twisting forces to counter off-centered mass distributions (Fu et al., 2010; Marneweck et al., 2016).
Although we have learnt a great deal about the underlying neural principles that guide reaching and grasping in terms of hand shaping, their integration with lifting has been largely neglected. Previous work has localized reaching and grasping into spatially distinct, functionally specialized networks (Jeannerod et al., 1994; Johnson and Grafton, 2003; Pisella et al., 2006). A dorsomedial stream linking superior parietal and dorsal premotor areas was thought to mediate reaching and a dorsolateral stream linking anterior intraparietal and ventral premotor areas to mediate grasping (Borra et al., 2017). These clear-cut anatomical and functional dichotomies have been challenged by multiple reports, particularly with respect to grasp-related activity beyond the dorsolateral stream in dorsomedial, ventral, postcentral somatosensory and cerebellar regions in humans (Grol et al., 2007; Verhagen et al., 2008; Monaco et al., 2011; Begliomini et al., 2014; Fabbri et al., 2014; Vesia et al., 2017; Cavina-Pratesi et al., 2018; Marneweck et al., 2018; Klein et al., 2019) and in monkeys (Galletti et al., 2003; Fattori et al., 2004, 2009, 2010, 2012; Filippini et al., 2017; Nelissen et al., 2018; Santandrea et al., 2018). These studies have typically manipulated grasp and touch, grip types or wrist orientations. Whether finer-level aspects of grasping, such as a grasp that needs to be precisely configured to an object's intrinsic properties and integrated with subsequent lift forces, are under the same widespread neural control is currently unknown. Previously, we showed a widespread network of regions sensitive to differences in minimizing roll of an object with a left or right center of mass (CoM) (Marneweck et al., 2018). Integrating grasp with lift forces was key to success in that how the grasp is configured defines the individual digit lift forces needed to prevent the object from tilting.
The aim of that study was unrelated to systematically varying either the grasp or lift force differences between left and right CoM manipulations, which could elucidate a region's sensitivity to each of these finer-level subcomponents of a dexterous object-centric action. Thus, it remains unknown whether the grasp and lift subcomponents of object-centric actions are represented by spatially separate regions or within the same regions in a temporally distinct manner.
The current study investigated spatial and temporal dynamics of neural representations of grasping and lift forces when they are to be integrated for a dexterous object-centric action in predefined cortical and cerebellar regions. During fMRI, subjects grasped with their thumb and index finger either collinearly (i.e., parallel) or non-collinearly (i.e., one digit higher than the other) an object with either a left or right CoM, with the goal of minimizing object tilt (Fig. 1). An appropriate compensatory torque required a non-collinear lift force strategy (e.g., more force by the digit on the CoM side) when digits were collinearly constrained, and a more collinear lift force strategy (i.e., similar force by the digits) when the digits were configured non-collinearly. A Bayesian, variational implementation of representational similarity analyses (vRSA) (Friston et al., 2019) tested the evidence that models based on contrasts defined by effects of object CoM, grasp configuration or their interaction contributed to region response patterns. This RSA method is particularly effective for model comparisons. Of particular interest were regions with model results favoring CoM and interaction effects, since this identifies whether a region is sensitive to grasp configuration or lift force, consistent with the hypothesis that these processes are under spatially separate neural control. For example, a region sensitive to a CoM effect, but only when the grasp configuration was non-collinear, suggests this region encodes grasp configuration but not lift force (since the configuration effectively minimizes differences of lift forces). On the other hand, a region sensitive to a CoM effect, but only when the grasp configuration was collinear, suggests this region encodes lift force but not the grasp configuration (which is fixed in the collinear case).
Notably, these models were run on convolution and deconvolution-modeled BOLD activity, the latter of which examined an alternative hypothesis: that these processes are encoded in a temporally distinct manner, such that some effects were contributing to a region's response pattern earlier than others (i.e., grasp before lift force).
Materials and Methods
Participants
Forty-eight right-handed healthy adult participants (median age: 21; range: 18–37; 24 women) with normal or corrected to normal vision took part in this study. All participants gave written informed consent and study procedures were approved by the Human Subjects Committee, Office of Research, University of California–Santa Barbara.
Materials, design, and procedures
Summary.
Subjects lay supine in the scanner with their head, shoulders, and right upper arm securely and comfortably padded to minimize excessive motion artifact. Following a T1-weighted anatomical scan, BOLD activity was measured as subjects were asked to reach, precision grasp, and lift an inverted T-shaped object with a CoM on the left or right (Fig. 1). The goal of the task was to minimize tilting the object. The extent to which grasp configuration and lift force varied between left and right CoM conditions was manipulated by instructing subjects to grasp the object collinearly (requiring a non-collinear force distribution) or to grasp the object non-collinearly (requiring a more symmetrical force distribution). Following preprocessing of structural and fMRI data and first-level convolution- and deconvolution-based general linear models at the individual subject and run level, vRSAs assessed the extent to which predefined regions were sensitive to grasp configuration differences or lift force differences between left and right CoM conditions over a given trial or at differing time points during the course of the trial.
Materials.
The custom-made Plexiglas object (Fig. 1) was shaped like an inverted T with a vertical column (height: 13.0 cm; width: 3.4 cm; depth: 5.0 cm) with a pair of height-adjustable circular grasp surfaces (diameter: 1.5 cm) attached on both sides (between grasp distance: 8 cm), and a horizontal flat base (height: 0.5 cm; width: 18.0 cm; depth: 5.0 cm). On the horizontal base, a lead block (height: 2.7 cm; width: 5.0 cm; depth: 3 cm; mass: 441 g) was placed on the left or right side (condition-dependent) of the vertical column to create an off-centered mass distribution/CoM. This off-centered CoM was concealed by black covers (height: 3.4 cm; width: 7.2 cm; depth: 5.0 cm). To measure the position and roll of the object, near-infrared LED markers were affixed at the bottom and the top of the vertical column. From these markers, six degrees of freedom positions were recorded with a two-camera MRI-compatible motion tracking system (Precision Point Tracking System, WorldViz; frame rate: 150 Hz; camera resolution: 640 × 480 VGA; spatial accuracy at focal distance: submillimeter). The total mass of the object was 688 g (torque = 223 Nmm).
The object and a button box were positioned at arm's length on a wooden table over the hips of participants in a supine position inside the scanner. With a mirror attached to the head coil, subjects had full view of the object, the button box, and their hand throughout the experiment.
The object was oriented 30° in a counterclockwise direction with respect to the frontal plane. Pilot work showed that lifting the object in this orientation minimized the wrist's biomechanical constraints that influence object roll in a supine position (e.g., wrist stiffening) and maximized object roll in both CoM directions during the non-collinear grasp configuration. In other words, the object could roll a similar amount whether the torque of the object was on its left or right side. The results would have been different had the object been centered along the frontal plane (e.g., with the wrist being less able to roll toward the right than the left in this position).
Experimental design and procedure.
The experiment had four within-subject conditions (Fig. 1): manipulating an object with a (1) left and (2) right CoM at collinear grasp contacts and with a (3) left and (4) right CoM at non-collinear grasp contacts. This 2 × 2 design allowed setting up and testing model comparisons based on contrasts in which either the lift force or grasp configuration varied between left and right CoM conditions (to test the extent to which each of these contrasts contributed to a region's response pattern).
The compensatory torque (Tcom) can be generated by a combination of lift force (Ftan), grip force (Fn), and grasp configuration (GC), according to the following:
Tcom = (d/2) × ΔFtan + Fn × ΔGC,
where ΔFtan is the difference in lift force between the thumb and the index finger, d is the grip width (80 mm), and ΔGC is the difference between the vertical coordinate of the thumb and index finger position. In line with previous work (Marneweck et al., 2015), a pilot study on 10 healthy adult subjects doing the same task with a similar object outside the scanner showed no difference in mean grip force between the left and right CoM conditions (all p > 0.05). This suggests that the difference in lift force and grasp configuration of the thumb and index finger would be the predominant means of generating an appropriate torque (thereby achieving task success) on objects with left- and right-sided CoMs. In a previous study (Marneweck et al., 2018), we found no differences in grasp kinematics when subjects in a supine position manipulated the object inside the scanner compared with sitting up outside the scanner. Thus, we did not foresee differences in grasp kinematics while subjects grasped the object collinearly or non-collinearly and applied the appropriate lift forces while lying down (inside the scanner) in the current study.
In conditions 1 and 2, the grasp configuration of the thumb and index finger was constrained to be collinear (ΔGC = 0). An appropriate moment to counteract the external torque of the object with a collinear grasp configuration requires an asymmetric lift force distribution (more lift force by the thumb than index finger in the left CoM condition and vice versa in the right CoM condition). Thus, the lift force distribution varies between conditions 1 and 2 while the grasp configuration is unchanged. In conditions 3 and 4, the grasp configuration of the thumb and index finger was constrained to be non-collinear, with the digit closest to the CoM on the upper grasp surface and the other digit on the lower grasp surface. With the vertical distance between the pair of grasp surfaces set at 1.60 cm, and assuming a mean grip force of 14 N (SD = 3.50) on an object of similar weight and torque (based on our pilot data), this digit positioning difference should equate to a more symmetrical lift force distribution by each digit on a successful trial (and thus a smaller difference in digit lift force between left and right CoM conditions than at collinear grasp contacts). Thus, in conditions 3 and 4, subjects were to use a non-collinear digit position partitioning (with the thumb higher than the index finger when the CoM is on the left, and vice versa when the CoM is on the right) in combination with more similar digit lift forces for generating an appropriate compensatory torque. Critically, in all four conditions, subjects were to modify and integrate load force distribution based on the specified grasp configuration to achieve an appropriate compensatory torque.
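The torque arithmetic behind the two grasp configurations can be checked with a short calculation. The sketch below assumes the torque balance has the form Tcom = (d/2) × ΔFtan + Fn × ΔGC, which is a reconstruction from the symbol definitions in the text rather than a quoted formula; the function name is hypothetical, and the numeric inputs (223 Nmm, 80 mm, 14 N, 1.60 cm) are taken from the Materials and design descriptions.

```python
# Illustrative torque balance. ASSUMPTION: Tcom = (d/2)*dFtan + Fn*dGC,
# reconstructed from the symbol definitions in the text.
D_MM = 80.0            # grip width d (mm), from the text
TORQUE_NMM = 223.0     # external object torque (N*mm), from the text
GRIP_FORCE_N = 14.0    # pilot mean grip force Fn (N), from the text
DELTA_GC_MM = 16.0     # vertical offset between grasp surfaces (1.60 cm)


def lift_force_difference(torque_nmm, grip_force_n, delta_gc_mm, d_mm=D_MM):
    """Solve the assumed balance for dFtan (N), given torque, Fn and dGC."""
    return (torque_nmm - grip_force_n * delta_gc_mm) / (d_mm / 2.0)


# Collinear grasp: dGC = 0, so lift force asymmetry carries the whole torque.
collinear = lift_force_difference(TORQUE_NMM, GRIP_FORCE_N, 0.0)
# Non-collinear grasp: the digit offset carries nearly all of the torque.
non_collinear = lift_force_difference(TORQUE_NMM, GRIP_FORCE_N, DELTA_GC_MM)

print(f"collinear dFtan: {collinear:.3f} N")          # 5.575 N
print(f"non-collinear dFtan: {non_collinear:.3f} N")  # -0.025 N
```

Under these assumptions the non-collinear configuration leaves a lift force difference close to zero, which is consistent with the text's description of a more symmetrical force distribution in conditions 3 and 4.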
Subjects first performed a block of 32 familiarization trials in which they lifted an object where its CoM alternated between left and right every four trials. Following this, subjects performed six functional runs of 28 trials (seven blocked trials of each of the four conditions), with an intertrial interval randomly chosen to be 2, 3, 4, 5, or 6 s, with a rest period between each of the four conditions (during which time the experimenter changed the CoM). The duration of the rest periods was ∼30 s (depending on the time it took the experimenter to change the CoM and provide the instruction for the upcoming block of trials). The order of the blocks of collinear and non-collinear grasp trials and the order of the blocked CoM within each of those trials were counterbalanced across runs and subjects. Subjects were informed about the CoM and digit position condition before the start of each of the four blocks within a given run. On each trial, subjects pressed the button with their right hand in a relaxed position until an audio start tone instructed them to reach, grasp and lift the object, hold it at the height of a marker (4 cm) until an audio stop tone (4 s after the start tone), after which the object was to be placed in its original position and the hand returned to the button. After this, an error tone was played if the object rolled >5° in either direction during the trial. The start tone of the first trial in each functional run always coincided with the beginning of a new functional image.
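As a rough illustration of the run structure described above, the following sketch builds one functional run of four blocked conditions with seven trials each and intertrial intervals drawn from 2 to 6 s. The function and variable names are hypothetical, the fixed block order is a placeholder, and the counterbalancing across runs and subjects is not modeled.

```python
import random

# Four within-subject conditions (CoM side, grasp configuration).
CONDITIONS = [
    ("left", "collinear"), ("right", "collinear"),
    ("left", "non-collinear"), ("right", "non-collinear"),
]
TRIALS_PER_BLOCK = 7
ITI_CHOICES_S = [2, 3, 4, 5, 6]
N_RUNS = 6


def build_run(block_order, rng):
    """Return one run as a list of (com, grasp, iti_s) trials, blocked by condition."""
    trials = []
    for com, grasp in block_order:
        for _ in range(TRIALS_PER_BLOCK):
            trials.append((com, grasp, rng.choice(ITI_CHOICES_S)))
    return trials


rng = random.Random(0)  # fixed seed only so the sketch is reproducible
run = build_run(CONDITIONS, rng)
print(len(run), "trials per run;", N_RUNS * len(run), "trials over six runs")
```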
Anatomical and functional MRI data were acquired using a Siemens 3T Magnetom Prisma Fit (64-channel phased-array head coil). High-resolution 0.94 mm isotropic T1-MPRAGE (TR = 2500 ms, TE = 2.2 ms, FA = 7°, FOV = 241 mm) sagittal sequence images were acquired of the whole brain. Following this, subjects performed the object manipulation task during which BOLD contrast was measured with a CMRR multiband (University of Minnesota) T2*-weighted echo planar gradient-echo imaging sequence (TR = 400 ms, TE = 35 ms, FA = 52°, FOV = 192 mm, multiband factor 8). Each functional image consisted of 48 slices acquired parallel to the AC-PC plane (3 mm thick; 3 × 3 mm in-plane resolution). The behavioral experimental protocol was interleaved between 25 TRs at the start and at the end of each functional run.
Statistical analyses
Kinematic data processing and analyses.
Kinematic data were filtered using a fourth-order Butterworth filter with a cutoff frequency of 5 Hz. Reach onset was defined as the button release onset. Object lift onset was defined as the time point at which the vertical position of the object exceeded 1 mm and remained above this value for 20 samples. The end of the lift was defined as the time at which the hand was back at the button. A full trial was classified as the duration from reach onset to the end of the lift. Object roll was defined as the angle of the object in the oblique plane. Peak object roll was recorded shortly after lift onset (∼250 ms), before somatosensory feedback resulted in corrective responses to counter object roll. Trials with object roll > 5° were classified as errors. Error trials were not of interest and were not analyzed, but they were modeled as a separate regressor in the general linear model (GLM) of each run for each subject. We were only interested in successful trials, which depended on the subject having the appropriate sensorimotor memory to generate the compensatory moment needed to minimize tilt (since feedback of object properties only became available after lift onset). Therefore, any carry-over adaptation effects to the next CoM block of trials were not of interest to the current research question and were excluded from the analyses. Nevertheless, these carry-over adaptation effects were minimal, with a very small proportion of trials classified as errors (M = 0.029, SD = 0.027).
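The lift onset and error criteria above can be sketched as follows. This is a minimal illustration that assumes already filtered position and roll traces, and it interprets "remained above this value for 20 samples" as a run of 20 consecutive samples that includes the onset sample; the function names are hypothetical and not the authors' code.

```python
LIFT_THRESHOLD_MM = 1.0   # vertical displacement threshold from the text
MIN_SAMPLES_ABOVE = 20    # samples the object must stay above threshold
ROLL_ERROR_DEG = 5.0      # roll beyond this marks an error trial


def find_lift_onset(height_mm):
    """Return the index of the first sample that starts a run of
    MIN_SAMPLES_ABOVE consecutive samples above threshold, or None."""
    run_start = None
    for i, h in enumerate(height_mm):
        if h > LIFT_THRESHOLD_MM:
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= MIN_SAMPLES_ABOVE:
                return run_start
        else:
            run_start = None  # run broken; wait for the next crossing
    return None


def is_error_trial(roll_deg):
    """True if absolute roll exceeds the 5 degree criterion at any sample."""
    return any(abs(r) > ROLL_ERROR_DEG for r in roll_deg)
```

At the 150 Hz frame rate reported for the tracking system, 20 samples correspond to roughly 133 ms, so brief threshold crossings caused by jitter would not be counted as lift onset.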
MRI data processing and analyses.
MRI data were preprocessed and analyzed using SPM12 (Wellcome Trust Center for Neuroimaging, London, UK) and FSL (https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/; Jenkinson et al., 2012). Each subject's functional images from all runs were spatially realigned to a mean image using second degree B-spline interpolation, and then coregistered to the T1 using SPM. Between-subject normalization of the cerebellum was conducted using the SUIT SPM toolbox (Diedrichsen, 2006; Diedrichsen et al., 2009, 2011; Diedrichsen and Zotow, 2015) (interpolation: trilinear, voxel size: 2×2×2 mm) and between-subject normalization of the rest of the brain was conducted with SPM's normalize function.
A Bayesian implementation of RSA, vRSA (Friston et al., 2019) was applied using an adaptation of the MATLAB script DEMO_CVA_RSA.m available in SPM12. This was used to assess the evidence for condition-specific patterns of responses distributed over voxels in a set of predefined regions. In an initial preparatory step, we extracted regional multivoxel patterns for each of the conditions of interest. This was done two ways: by performing both a convolution- and deconvolution-based GLM for each run and for each subject. In the convolution method, a GLM with events convolved with the standard canonical HRF basis function in SPM was estimated, with four experimental conditions (left CoM collinear grasp, left CoM non-collinear grasp, right CoM collinear grasp, right CoM non-collinear grasp) and an error condition (if any occurred in a run) entered as regressors. The rationale for this GLM was to identify those regions that were sensitive by RSA analysis to any of the tested contrasts regardless of when the distinctive patterns occurred across the whole trial. In addition, we used the opportunity to replicate findings from a previous study where we used an RSA approach based on crossnobis distances to identify neural representations encoding differences in torque direction (by contrasting left and right-weighted object interactions) (Marneweck et al., 2018). Thus, the whole trial was modeled for each event, from reach onset (button release) to the end of the lift (button down). For regions showing sensitivity to any of the main effects or interaction effects as tested by vRSA (see below), a subsequent deconvolution-based GLM approach, with a finite impulse response (FIR) function for each run and subject was computed, with the onsets for each of the four experimental conditions set to lift onset (window length: 7.2 s; order: 800 ms). 
This time window and order were selected to sufficiently track activations through the peak of the HRF, which was assumed to occur ∼6 s after lift onset. The rationale for deriving β values from a deconvolution-based GLM approach was to use vRSA to assess our hypothesis that condition-specific responses might occur in a temporally distinct manner (i.e., first grasp configuration, then lift force). In both convolution and deconvolution-based GLMs, the RobustWLS Toolbox in SPM (Diedrichsen and Shadmehr, 2005) was used to down-weight functional images with high noise variance to account for movement artifact.
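To make the FIR setup concrete, the sketch below builds an indicator design matrix for one condition. It assumes that "order: 800 ms" denotes the bin width, which gives nine bins over the 7.2 s window and two scans per bin at TR = 400 ms. It is a plain-Python illustration with hypothetical names, not SPM's implementation, and it snaps each onset to the nearest scan.

```python
TR_S = 0.4          # repetition time from the acquisition section
BIN_S = 0.8         # ASSUMED FIR bin width ("order: 800 ms" in the text)
WINDOW_S = 7.2      # FIR window length from the text

N_BINS = round(WINDOW_S / BIN_S)        # 9 bins
SCANS_PER_BIN = round(BIN_S / TR_S)     # 2 scans per bin


def fir_design(onsets_s, n_scans):
    """Return an n_scans x N_BINS indicator matrix (list of lists).

    Row t, column b is 1 when scan t falls in bin b after any onset.
    """
    design = [[0] * N_BINS for _ in range(n_scans)]
    for onset in onsets_s:
        first_scan = round(onset / TR_S)  # snap onset to the nearest scan
        for b in range(N_BINS):
            for k in range(SCANS_PER_BIN):
                t = first_scan + b * SCANS_PER_BIN + k
                if 0 <= t < n_scans:
                    design[t][b] = 1
    return design
```

Each column then yields one β estimate per condition and time bin, which is what allows the later analyses to ask when a contrast starts to contribute to a region's response pattern.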
A second preparatory step to vRSA was the extraction of the GLM-derived β values for each of the four conditions from regions of interest (ROIs) using FSL's fslmeants. β values from ROIs were extracted separately for each run. Predefined cerebellar and cortical ROIs were selected that have previously been shown to be sensitive to differences when manipulating objects of different torques (Marneweck et al., 2018). Left-hemisphere cortical ROIs (Fig. 2A) were extracted from the SPM Anatomy Toolbox (Eickhoff et al., 2005, 2006, 2007) including motor area 4a, motor area 4p, anterior intraparietal area (AIP), superior parietal area 5 and 7 (SPL5, SPL7), primary somatosensory cortex (PSC/SI), parietal operculum area 1 and 4 (OP1, OP4), Broca area (BA) 44, inferior parietal area (IPL), lateral occipital cortex (LOC), and a control region, auditory cortex (AUD). Dorsal and ventral premotor areas (PMd and PMv) and supplementary motor area (SMA) were free-drawn on a standardized surface mesh in SUMA (Saad et al., 2004) based on predefined anatomical parcellations (Geyer et al., 1996; Picard and Strick, 2001; Tomassini et al., 2007; Destrieux et al., 2010), which were projected to standard MNI space and mapped back to the subject's T1-weighted image (Barany et al., 2014). Superior parieto-occipital cortex (SPOC) was defined according to a previous functional study (Fabbri et al., 2012) as per Di Bono et al. (2015) (centered MNI coordinates: −17, −76, 40; diameter: 8 mm). Cerebellar ROIs (Fig. 2B) were extracted from a recently published cerebellar functional atlas (King et al., 2019) and comprised regions that activated during right-hand movements (region 2) and motor planning (region 2 and region 4). β values from a cerebellar control region that activated during emotional and language processing (region 7) were also extracted.
Similar to a more traditional RSA approach, vRSA contrasts between-condition differences in spatial voxel patterns. Whereas these between-condition contrasts are specified in terms of a representational dissimilarity matrix in traditional RSA, in vRSA, the contribution of between-condition contrasts to a region's response pattern is expressed in terms of second-order similarity or covariance matrices. These matrices summarize the relationship between patterns in terms of each of the experimental conditions. vRSA was chosen over traditional RSA for its robustness in identifying the consistency of a given contribution across multiple runs and also at the between-subject level. As with other pattern component modeling methods, it provides a formal method for testing the contribution of more than one contrast to a region's response pattern while taking into account all specified contrasts and their interactions in a model comparison framework. Specifically, vRSA uses Bayesian model comparisons to assess the evidence for a region to be sensitive to different contrasts by comparing the log evidence between the group level model contrast and that same model contrast when one hyperparameter of another model contrast is decreased toward zero by specifying precise hyperpriors, thereby essentially removing its contribution. A relative log evidence of 3 corresponds to a Bayes factor of approximately exp(3) ≈ 20 to 1. The Bayes factor is a fundamental part of the Bayesian approach to testing hypotheses, providing a continuous degree or measure of evidence for the null and alternative hypotheses, H0 and H1 (see Dienes and Mclatchie, 2018, for a recent review). It also provides evidence for competing models as is done here, which can provide evidence that a region is sensitive to more than one contrast. With a Bayes factor of 1, both H0 and H1 models predict the data equally well and the evidence does not favor either model over the other.
The evidence favors H1 over H0 when the Bayes factor increases beyond 1 (toward infinity). The evidence favors H0 over H1 when the Bayes factor decreases below 1 (toward 0). Unlike the fixed significance or threshold levels of the Neyman-Pearson frequentist approach, this Bayesian approach offers a continuous degree of evidence. Nonetheless, rough guidelines have been provided, similar to those of Cohen (1988) for effect sizes. In particular, Jeffreys (1998) and Kass and Raftery (1995) have suggested that a Bayes factor of approximately 3 matches a “substantial” amount of evidence that a contrast of interest contributes to a region's observed response pattern. Moreover, Dienes (2014) argued that a Bayes factor of approximately 3 occurs when a result is just significant using the frequentist approach. As will be seen below, our log evidence values were well above this criterion value.
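The relationship between relative log evidence and the Bayes factor described above is a single exponential. The following sketch, with hypothetical helper names, shows the conversion and the direction of evidence; it does not encode any particular labeling scheme for evidence strength.

```python
import math


def bayes_factor(log_evidence_difference):
    """Convert a difference in log model evidence to a Bayes factor."""
    return math.exp(log_evidence_difference)


def direction(bf):
    """Direction of evidence only; no category thresholds are implied."""
    if bf > 1:
        return "favors H1"
    if bf < 1:
        return "favors H0"
    return "no preference"


print(round(bayes_factor(3.0), 2))   # 20.09, i.e., roughly 20 to 1
print(round(math.log(3.0), 2))       # 1.1, the log evidence matching a Bayes factor of 3
```

A Bayes factor of about 3 therefore corresponds to a log evidence difference of about 1.1, so a log evidence criterion of 3 is considerably stricter than the "substantial" guideline.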
As outlined in the Friston et al. (2019) technical note describing the variational implementation of representational similarity analyses (vRSA), a crucial step in this analysis is the introduction of a prior on the hyperparameters (also known as hyperpriors). For this study the following priors were used within the vRSA script: hyperprior expectation in log-space = −32; hyperprior covariance in log-space = 256; and prior expectation and covariance of reduced model = 1/256. The hyperprior expectation value represents the prior belief in the strength of each of the tested components (e.g., exp(−32) = 1.2664e-14). That is, each component is something positive that is really close to zero. The larger this value, the stronger the belief in the tested component. The second parameter, the hyperprior covariance, is the prior belief about the covariance between the tested components. For example, if a region has a pattern for a CoM effect, what is our belief that it also has a pattern for a grasp configuration effect or their interaction? The larger the number, the stronger the belief that they could covary, making it easier for multiple components to reach significance. We selected the default hyperprior values from the spm_reml_sc.m script (−32, 256) based on the lack of research on how the brain represents grasping and lifting in a coordinated way. In other words, we had little evidence or prior beliefs as to how the brain might represent these behaviors and therefore adopted a conservative uninformed approach in the selection of hyperpriors.
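The hyperprior values quoted above can be checked numerically. The short sketch below only evaluates the numbers given in the text; reading the value 256 as a variance in log-space (standard deviation 16) is an interpretive assumption made for illustration.

```python
import math

HYPERPRIOR_MEAN_LOG = -32.0   # hyperprior expectation in log-space (from the text)
HYPERPRIOR_VAR_LOG = 256.0    # hyperprior covariance in log-space (from the text)

# Prior strength of each component on the natural scale.
prior_strength = math.exp(HYPERPRIOR_MEAN_LOG)
# ASSUMPTION: treat 256 as a variance, giving a log-space standard deviation.
prior_sd_log = math.sqrt(HYPERPRIOR_VAR_LOG)

print(f"{prior_strength:.4e}")   # 1.2664e-14
print(prior_sd_log)              # 16.0
```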
With the present study's 2 × 2 design (CoM × grasp), the contributions of three contrasts to ROI response patterns were assessed using vRSA. The first, testing a main effect of CoM, contrasted conditions in which the object was lifted with a left and right CoM (regardless of non-collinear and collinear grasp contact points). The second, testing a main effect of grasp, contrasted conditions in which the object was lifted at non-collinear and collinear grasp contact points (regardless of the CoM). The third, testing a CoM × grasp interaction effect, contrasted left and right-CoM conditions at collinear and non-collinear grasp contact points, respectively. A significant interaction would suggest that the region is sensitive to a left versus right CoM difference at either non-collinear or collinear grasp contact points. As will be detailed below, vRSAs on the convolution-modeled data for the most part showed that ROIs sensitive to a CoM effect were also sensitive to a grasp effect, and interactions were rare. This suggested that regions encoded both torque direction, grasping and/or lifting (in large part independently) and with a convolution-based GLM it was difficult to ascertain the extent to which these processes were encoded in a temporally distinct manner. To this end, a deconvolution-based vRSA approach explored the possibility that these processes are contributing to a region's response at differing time courses. Furthermore, regions showing interaction effects would suggest the region is either sensitive to grasp configuration differences between left and right CoM conditions or lift force differences between left and right CoM conditions.
To further explore these, the contribution of two additional contrasts to a given ROI's response pattern was tested using vRSA on deconvolution-modeled BOLD activity: non-collinear left CoM versus non-collinear right CoM (manipulating grasp configuration) and collinear left CoM versus collinear right CoM (manipulating lift force).
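The five contrasts described above can be written as vectors over the four conditions. In the sketch below, the condition order follows the GLM regressors listed earlier, and each contrast is turned into a condition-by-condition matrix by an outer product. Using the outer product as the second-order component is one common convention and is an assumption here, as are all names; this is not a transcription of the authors' script.

```python
# Condition order: left/collinear, left/non-collinear,
#                  right/collinear, right/non-collinear.
CONDITIONS = ["L-col", "L-non", "R-col", "R-non"]

CONTRASTS = {
    "CoM":            [1, 1, -1, -1],   # left vs right CoM
    "grasp":          [1, -1, 1, -1],   # collinear vs non-collinear
    "interaction":    [1, -1, -1, 1],   # CoM x grasp
    "non-col L vs R": [0, 1, 0, -1],    # grasp configuration differs
    "col L vs R":     [1, 0, -1, 0],    # lift force differs
}


def outer(c):
    """Condition x condition matrix c c^T for one contrast vector."""
    return [[a * b for b in c] for a in c]


def dot(u, v):
    """Inner product of two contrast vectors."""
    return sum(a * b for a, b in zip(u, v))


components = {name: outer(c) for name, c in CONTRASTS.items()}
```

In this coding the three factorial contrasts are mutually orthogonal, and the interaction equals the collinear left-versus-right contrast minus the non-collinear one, which is why the two additional contrasts help to unpack an interaction effect.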
Critically, the Bayes criteria assessing the merits of the different models (i.e., across different FIRs and contrasts) are all derived from the same hierarchical Bayes estimation. There are no repeated estimations in this context. In hierarchical Bayesian analyses that incorporate the between-subject consistency of a given effect/component, as we have performed, the interpretation of the data is not influenced by multiple comparisons (Gelman et al., 2012; Kruschke, 2015; Gelman and Loken, 2016). Finally, Friston et al. (2019) detail how the vRSA approach accounts for covariance within a region of p voxels. This covariance is estimated in the hierarchical Bayes model (equation 7 in Friston et al., 2019). This accounts for ROIs with different numbers of voxels and covariance between voxels, analogous to crossnobis distances in RSA. As a secondary check, we correlated ROI sizes with log evidence values for each of the components from the first vRSA and found no correlation between ROI size and log evidence values (CoM: r = 0.21; grasp: r = −0.01; interaction: r = −0.18).
Results
Representational mapping of grasp configuration, lift force, and torque
The first convolution-based vRSA tested whether any of the predefined regions are sensitive to contrasts of conditions in which the CoM varies (regardless of the grasp configuration on the object), the grasp configuration on the object varies (regardless of the CoM), and finally, a contrast suggesting an interaction between CoM and grasp configuration. Table 1 shows log evidence values that quantify the model evidence for response patterns related to each of these contrasts in each of the ROIs. Of note, most regions with CoM effects lacked interaction effects, which suggests these regions were sensitive to the torque difference regardless of the way it was achieved (e.g., grasp configuration or lift force asymmetry). Instead, most regions with CoM effects also showed significant grasp configuration effects, namely in motor (4a), premotor (SMA), somatosensory (PSC and OP1) and parietal (SPL7 and AIP) regions. Conversely, single main effects were seen selectively in PMd and cerebellar regions 2 and 4 (CoM effect) and in LOC and IPL (grasp configuration effect), and interactions were present only in PMv and BA44.
Representational mapping of temporal relationship between grasp configuration, lift force, and torque
The results from the convolution-based vRSAs suggested that most regions sensitive to intrinsic object properties with varying torques (i.e., CoM effects) are also sensitive to differences of grasp configurations (i.e., grasp effects). Although these results support the plausibility that the integration of grasp configuration, lift, and torque is configured by these particular ROIs, at the same time, these results, alongside the lack of demonstrable interaction effects, raise questions as to how grasping and lifting are encoded with respect to each other. To this end, a deconvolution-based vRSA approach explored the possibility that these processes contribute to a region's response in a temporally distinct manner, with the first bin aligned with lift onset. These RSAs were run on the 13 ROIs that showed a significant effect from the previous set of analyses. As a first step, we combined results across all 13 ROIs to identify overall timing differences for the emergence of distinct RSA patterns for the three contrasts. This more succinct descriptive approach was taken over plotting individual ROI sensitivities to each of the model contrasts, since we were primarily interested in the time at which the contrasts contributed to response patterns across the group of ROIs rather than identifying which ROIs were sensitive to which contrasts. As will be detailed below, we focus on individual-level ROI results in analyses that followed to distinguish regional patterns sensitive to grasp configuration and lift forces, respectively, which could not be distinguished with the current set of model contrasts. Figure 3 shows the number of regions at each time point where the log estimates exceed 3, indicating strong evidence that a condition contrast (CoM, grasp configuration, interaction) contributed to an ROI's response in each of 9 FIR bins.
Whereas all three contrasts contribute to an increasing number of ROI response patterns over time, there is a marked temporal difference in when each contrast contributes among the 13 ROIs. Although all the contrasts peak between 4.8 and 5.6 s after lift onset (FIR 6), the rate at which a given contrast contributes varies. Specifically, evidence for grasp and interaction effects starts to consistently increase at an earlier time point than the CoM effect. Similarly, grasp and interaction effects drop off at a faster rate than the CoM effect.
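The network-level summary in Figure 3 is a count, per FIR bin, of the ROIs whose log evidence exceeds 3. The sketch below shows how such a tally could be computed. It is not the study's code; the ROI labels and values are hypothetical placeholders, and `count_rois_above` is an illustrative helper name.

```python
import math

# Threshold on log evidence used in the text for "strong evidence".
STRONG_EVIDENCE = 3.0


def count_rois_above(log_evidence, threshold=STRONG_EVIDENCE):
    """Count, for each FIR bin, the ROIs whose log evidence exceeds threshold.

    log_evidence maps an ROI label to a list with one value per FIR bin.
    """
    n_bins = len(next(iter(log_evidence.values())))
    return [
        sum(1 for values in log_evidence.values() if values[b] > threshold)
        for b in range(n_bins)
    ]


# Hypothetical placeholder values for one contrast (3 ROIs x 9 bins),
# NOT data from the study.
example = {
    "ROI_A": [0.5, 1.2, 3.4, 4.0, 5.1, 6.2, 4.4, 2.9, 1.0],
    "ROI_B": [0.1, 0.3, 0.9, 3.1, 4.2, 5.0, 3.5, 3.2, 2.0],
    "ROI_C": [3.5, 3.8, 2.0, 1.0, 0.5, 3.3, 0.2, 0.1, 0.0],
}

counts = count_rois_above(example)
print(counts)

# A log evidence of 3 corresponds to a Bayes factor of exp(3), roughly 20.
print(round(math.exp(STRONG_EVIDENCE), 1))
```

Running the same tally separately for the CoM, grasp, and interaction contrasts would give one curve per contrast, which is the form of comparison described in the text.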
Figure 3-1
The results depicted in Figure 3 point to a temporal dissociation in regional sensitivity toward that which is unique to interaction and grasp effects and that which is unique to a CoM effect. CoM effects suggest a given ROI's sensitivity to the overall torque force regardless of the way it is achieved. Grasp configuration effects were difficult to interpret because they could suggest an ROI is sensitive to either or both grasp configuration or lift force differences, both of which varied (albeit subtly) between non-collinear and collinear grasp configurations within a given CoM. To clarify, this effect reflected a region's sensitivity to the difference between grasping an object with a left CoM (or with a right CoM) collinearly and non-collinearly. In this contrast, the grasp position is different (non-collinear vs collinear) but the lift force distribution required to generate the compensatory torque in the opposite direction of CoM will also be different. For example, with a collinear grasp on an object with a left CoM, subjects had to apply more lift force with the thumb than with the index finger, whereas with a non-collinear grasp, subjects had to apply a more symmetrical lift force distribution across the two digits. Thus, regions with these grasp configuration effects could be reflective of a grasp configuration or lift force difference. In addition, an interaction effect in conjunction with a CoM effect suggested a region is either sensitive to a grasp configuration difference between left and right CoM conditions at non-collinear grasp contacts or to a lift force difference between left and right CoM conditions at collinear grasp contacts. To distinguish between these possibilities (i.e., whether a region is sensitive to grasp or lift force differences), we conducted two additional contrasts on the ROIs with combined CoM and interaction effects (however, all log values indicative of strong evidence for all components in all FIR bins can be found in Fig. 3-1).
Specifically, the two additional model contrasts examined the evidence that an ROI was sensitive to differences between left and right CoM conditions at collinear (suggesting the encoding of lift force) or non-collinear grasp contacts (suggesting the encoding of grasp configuration).
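The relation between these two additional contrasts and the earlier CoM and interaction contrasts can be made explicit with a small sketch. The condition order and ±1 coding below are illustrative assumptions, not the study's actual design coding. Under this coding, the sum of the two simple contrasts recovers the CoM main effect and their difference recovers the interaction, which is why regions showing combined CoM and interaction effects are natural candidates for this follow-up.

```python
# Assumed condition order (illustrative only):
# [left CoM collinear, left CoM non-collinear,
#  right CoM collinear, right CoM non-collinear]

# Left vs. right CoM at collinear contacts (interpreted as lift force).
com_at_collinear = [1, 0, -1, 0]

# Left vs. right CoM at non-collinear contacts (interpreted as grasp
# configuration).
com_at_noncollinear = [0, 1, 0, -1]


def add(u, v):
    """Element-wise sum of two contrast vectors."""
    return [a + b for a, b in zip(u, v)]


def subtract(u, v):
    """Element-wise difference of two contrast vectors."""
    return [a - b for a, b in zip(u, v)]


# Sum of the simple contrasts: the CoM main effect under this coding.
com_main = add(com_at_collinear, com_at_noncollinear)

# Difference of the simple contrasts: the CoM x grasp interaction.
interaction = subtract(com_at_collinear, com_at_noncollinear)

print("CoM main effect:", com_main)
print("interaction:", interaction)
```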
Figure 4 shows log estimate values in each of the 9 FIR time bins reflecting the evidence for a contrast between left and right CoM conditions at non-collinear grasp contact points (manipulating grasp configuration) and for a contrast between left and right CoM conditions at collinear grasp contact points (manipulating lift force). All of the regions are at some point sensitive to both grasp configuration and lift force effects. In addition, in all regions a temporal dissociation is evident between sensitivity to grasp configuration and lift force effects during early FIR bins. In FIR bins 1 through 3, most regions are only sensitive to grasp configuration, with the exception of the cerebellar region showing sensitivity to lift force by FIR bin 2. Most other cortical regions only start to show sensitivity to lift force in FIR bin 4 onwards. Regions showing early sensitivity to lift force occur in isolation (e.g., PMv, bin 4) and over time these effects are seen in concert with a reemergence of a sensitivity to the grasp effect (AIP, SPL7, PSC, CER2, LOC).
Discussion
The current study investigated the spatial and temporal dynamics of neural representations that configure the grasp and initiate the appropriate lift force for a dexterous object-centric action. A behavioral paradigm that varied the grasp configuration or lift force requirement between lifting an object with a left and right CoM, in conjunction with Bayesian RSAs on corresponding contrasts from convolution- and deconvolution-modeled fMRI data, revealed strong support for the hypothesis that the same regions represent both of these processes in a temporally distinct manner.
Results from the convolution-based vRSAs showed that many regions are sensitive to differences in manipulating objects of varying torques, as evidenced by widespread CoM effects. These findings are consistent with findings from previous studies adopting more traditional RSA approaches (Marneweck et al., 2018; Klein et al., 2019). Interestingly, these widespread CoM effects were mostly present in conjunction with grasp effects. Such combined effects, with a lack of interaction effects, support the idea that the same regions are sensitive to grasp configuration, lift force distributions, and torques. Nevertheless, how these regions allow such subcomponents of object-centric actions to be generated with respect to each other was not clear based on the convolution results alone. To address this, vRSAs on deconvolution-modeled BOLD activity were conducted to test the hypothesis that these subcomponents of object-centric actions are encoded in a temporally distinct manner (i.e., one after the other).
At the network level (Fig. 3), it was clear that the consistent increase in regional patterns sensitive to grasp and the interaction of grasp and CoM was emerging earlier and also disappearing earlier than the consistent increase in regional patterns sensitive to the CoM effect. This suggests that torque generation differences (CoM effect) were contributing to most response patterns later than the grasp and lift force components (grasp and interaction effects). At the individual ROI level, the deconvolution-based results identified more regions sensitive to both CoM and interaction effects than were shown by the convolution results. To further characterize the interaction term, two additional model comparisons assessed the extent to which these regions were sensitive to the grasp configuration or lift force differences between left and right CoM conditions. In line with our hypothesis, results showed that all of these regions were sensitive to both grasp configuration and lift force; however, the time course of sensitivity to each effect varied, with that of grasp preceding that of lift force.
Dorsolateral regions, PMv and AIP, along with SPL7, somatosensory PSC, ventral LOC, and cerebellar regions were sensitive to subtle variations in grasp configurations that dictate subsequent lift force distributions by the thumb and index finger. This is the first report giving evidence that these regions are not simply representing grip types, aperture, and orientation but are also sensitive to finer-level aspects of grasp that require integration with lift forces to generate the appropriate compensatory torque to counter the intrinsic object torque property. That sensitivity to grasp configuration was not seen in regions within the dorsomedial stream is in line with the proposal that dorsolateral and dorsomedial streams contribute to grasping in different ways (Galletti and Fattori, 2018). Our results suggest that fine-level grasp control requiring inextricable linking with lifting seems more under dorsolateral than dorsomedial control. This functional difference is also supported by studies showing increased dorsomedial activity when shaping the hand around a large compared with a small object (Grol et al., 2007) and by studies showing increased dorsolateral activity during precision grasping than during whole-hand or coarse-level grasping (Begliomini et al., 2007; Cavina-Pratesi et al., 2018). Furthermore, our results show that hand shaping and grasping are not exclusively encoded by regions in the dorsolateral stream (Galletti et al., 2003; Fattori et al., 2004, 2009, 2010, 2012; Grol et al., 2007; Verhagen et al., 2008; Monaco et al., 2011; Begliomini et al., 2014; Filippini et al., 2017; Vesia et al., 2017; Nelissen et al., 2018; Cavina-Pratesi et al., 2018; Marneweck et al., 2018; Santandrea et al., 2018; Klein et al., 2019), with involvement noted in ventral (LOC), somatosensory (PSC), superior parietal (SPL7), and cerebellar regions.
The current study sought to clarify whether the grasp and lift subcomponents of object-centric actions are represented by spatially separate regions or within the same regions in a temporally distinct manner. As hypothesized, the same regions (PMv, AIP, LOC, SPL7, CER2, and PSC) that are sensitive to variations in grasp configurations were also shown to be sensitive to variations in lift forces when the grasp was constrained to be collinear. Many of these regions (e.g., PMv, AIP, PSC, and cerebellum) have consistently been reported to be sensitive to lift force (compared with rest) or to overall lift force differences based on object weight or density (Kinoshita et al., 2000; Ehrsson et al., 2003; Schmitz et al., 2005; Chouinard et al., 2009; Cavina-Pratesi et al., 2018). Extending that previous work, our results provide strong evidence that these regions can encode more than the overall combined lift force distribution of involved digits. Specifically, these regions are sensitive to individual digit lift force distributions (which varied between left and right CoM conditions), even when the overall combined lift force by both digits was the same between these conditions. In addition, the strong evidence for LOC sensitivity to lift force differences reported here adds to the growing evidence base that this ventral visual stream node is also sensitive to nonvisual features of object-centric actions (Gallivan et al., 2014).
This study is the first to test and support the hypothesis that the same region encodes both grasp and lift parameters of a dexterous object-centric action rather than these parameters being encoded by spatially separate regions. In line with how these subcomponents emerge behaviorally, the contribution of grasp configuration preceded the contribution of lift forces to a given region's response patterns. Within the first 2.4 s following lift onset, these regions were only sensitive to grasp configuration differences, with the exception of cerebellar region 2 showing sensitivity to lift force differences between 0.8 and 1.6 s following lift onset. Other regions' sensitivity to lift force was apparent 3.2 s after lift onset in AIP and PMv, 4.8 s after lift onset in SPL7, 5.6 s after lift onset in PSC, and 6.6 s after lift onset in LOC. We acknowledge that our fMRI paradigm and experimental design were not optimized to definitively demarcate which time bins correspond to prelift anticipatory processes versus postlift feedback processes, both of which varied between left and right CoM collinear grasp conditions. Nevertheless, based on the average delay of the hemodynamic response known to peak 4 to 6 s later, activity occurring within the first 4 s following lift onset likely corresponds to activity before lift onset. Thus, the first voxel pattern reflecting grasp configuration differences (0.8 - 1.6 s after lift) likely emerges before object contact. Similarly, sensitivity to lift force differences before 4 s postlift in cerebellum, AIP, and PMv likely reflects a representation of anticipatory force control. Whether SPL7, PSC, and LOC relate to anticipatory or feedback control is harder to distinguish. Earlier involvement of AIP than SPL has also previously been reported in the context of a visually guided grasp task (Tunik et al., 2008).
Similarly, PSC has previously been shown to decode object lifting during the execution/feedback phase but not the planning phase in a delayed-grasp paradigm (Gallivan et al., 2014).
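The timing argument above can be laid out as a small sketch that enumerates the FIR bins and flags those falling within the first 4 s after lift onset. It assumes 9 equal-width bins spanning the 7.2 s window (0.8 s each), with the first bin starting at lift onset and bins labeled from 1. The 1-based labeling is an assumption and may be offset from the labels used in the figures. The 4 s cutoff is a heuristic based on the hemodynamic delay, not a measured boundary.

```python
N_BINS = 9              # number of FIR bins reported in the text
WINDOW_S = 7.2          # modeled time window from lift onset, in seconds
PRELIFT_CUTOFF_S = 4.0  # heuristic cutoff based on hemodynamic delay


def fir_bins(n_bins=N_BINS, window_s=WINDOW_S):
    """Return (label, start_s, end_s) for each equal-width FIR bin.

    Labels are 1-based here, which is an assumption of this sketch.
    """
    return [
        (i + 1,
         round(i * window_s / n_bins, 1),
         round((i + 1) * window_s / n_bins, 1))
        for i in range(n_bins)
    ]


bins = fir_bins()

# Bins that end at or before the cutoff are treated as likely reflecting
# activity preceding lift onset, following the reasoning in the text.
likely_prelift = [label for label, _, end in bins if end <= PRELIFT_CUTOFF_S]

for label, start, end in bins:
    tag = "likely pre-lift" if label in likely_prelift else "ambiguous/post-lift"
    print(f"bin {label}: {start:.1f}-{end:.1f} s ({tag})")
```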
A limitation of the current study was the lack of in-scanner kinetic and kinematic recordings. However, our previous study (Marneweck et al., 2018), which included kinematic recording of grasp positions on the same object, showed that subjects were very accurate in grasping the object at instructed contact points. Conducive to this behavior in both studies was a very small diameter of the circular grip surfaces (1.5 cm) with little to no opportunity for digit position variability. Furthermore, in this study the experimenter inside the scanner during the experiment consistently ensured that participants were following the correct digit position instructions. Force transducers inside the scanner would have strengthened the case that forces were more or less similar between left and right CoM conditions at non-collinear contact points (and thus the only aspect varying between these conditions would be the digit position differences driving the pattern differences in ROIs). Depending on the grip force, there is a possibility for lift forces to be slightly different between these left and right CoM conditions. However, if this were the case, then we would not have found the temporal dissociation between when a region detects a difference between left and right CoM conditions at collinear contact points (definitely reflective of a lift force difference) and when it detects a difference between left and right CoM conditions at non-collinear contact points (reflective of the grasp position difference). To test whether these regions are also sensitive to subtler variations in grasp position and lift force, a future study could contrast off-centered CoM conditions at collinear and non-collinear grasp contact points with a centered CoM condition at collinear contact points.
Specifically, lift force varies more subtly in a contrast between a centered CoM condition at collinear contact points and an off-centered CoM condition at collinear contact points than it does between two opposing CoM conditions. Thus, a region/FIR sensitive to lift force should detect this difference, which is subtler than the contrast tested in the current study. Likewise, grasp configuration varies more subtly in a contrast between a centered CoM condition at collinear contact points and an off-centered CoM condition at non-collinear contact points than it does between the two non-collinear grip contact points tested here. Thus, a region/FIR sensitive to grasp position should detect this subtler difference.
As a final note, we consider the observed discrepancies between the convolution- and deconvolution-based results. A lack of CoM effects in LOC was seen in the convolution-based results, whereas this effect was present in the deconvolution results. The deconvolution-based analyses focused on response patterns during a 7.2 s time window from lift onset, which, given the delay of the hemodynamic response, tracked the extent to which a given model contrast reflected underlying regional patterns commencing with the reach to shortly after lifting the object. The convolution-based analyses focused on response patterns during the entire trial, which would include the reach, the lift, release, and return of the hand back to the start button. Thus, the effect of CoM in only the deconvolution-based result in LOC seems indicative of this region being sensitive to differences in CoM before the commencement of or shortly after lift. This possibility fits with previous accounts of this region encoding memory representations of object properties (e.g., weight; Gallivan et al., 2014). Similarly, effects of CoM and grasp in OP1 that are seen in the convolution- but not the deconvolution-based results might reflect this region's sensitivity to postlift-related behavior (such as somatosensory feedback of object properties after lift onset), which is less represented in the time window modeled in the deconvolution analyses. The lack of overall interaction effects in the convolution-based data also makes sense in that interactions in the deconvolution data occur during only a part of the whole trial (reach to lift) and taper off by the end of that time. Finally, it might be questioned why an interaction effect is seen (instead of a CoM and grasp effect) in the deconvolution-based results, if the same region is shown to be sensitive to both grasp and lift force differences between left and right CoM conditions.
Although one would certainly expect this to be the case if the same region was always sensitive to both grasp and lift force in each of the time bins, this is not what the data show. Most of the time there is much stronger evidence for one contrast over the other (so it is possible that the interaction term tracks the contrast with the strongest evidence at a given time bin).
Footnotes
This work was supported by the National Health and Medical Research Council (C.J. Martin Biomedical Fellowship GNT1110090 to M.M.) and the Rutherford Fett Fund (S.T.G.). We thank Mario Mendoza, Danny Toomey, and Naomi Meave Ojeda for assistance with data collection.
The authors declare no competing financial interests.
- Correspondence should be addressed to Scott T. Grafton at stgrafton{at}ucsb.edu