Abstract
Actions can be understood based on form cues (e.g., static body posture) as well as motion cues (e.g., gait patterns). A fundamental debate centers on the question of whether the functional and neural mechanisms processing these two types of cues are dissociable. Here, using fMRI, psychophysics, and transcranial magnetic stimulation (TMS), all within the same human participants, we show that mechanisms underlying body form and body motion processing are functionally and neurally distinct. Multivoxel fMRI activity patterns in the extrastriate body area (EBA), but not in the posterior superior temporal sulcus (pSTS), carried cue invariant information about the body form of an acting human. Conversely, multivoxel patterns in pSTS, but not in EBA, carried information about the body motion of the same actor. In a psychophysical experiment, we selectively impaired body form and body motion discriminations by manipulating different visual cues: misaligning the ellipses that made up a dynamic walker stimulus selectively disrupted body form discriminations, while varying the presentation duration of the walker selectively affected body motion discriminations. Finally, a TMS experiment revealed causal evidence for a double-dissociation between neural mechanisms underlying body form and body motion discriminations: TMS over EBA selectively disrupted body form discrimination, whereas TMS over pSTS selectively disrupted body motion discrimination. Together, these findings reveal complementing but dissociable functions of EBA and pSTS during action perception. They provide constraints for theoretical and computational models of action perception by showing that action perception involves at least two parallel pathways that separately contribute to the understanding of others' behavior.
- biological motion
- extrastriate cortex
- form vs motion
- functional imaging
- transcranial magnetic stimulation
- visual processing
Introduction
We effortlessly identify actions, intentions, and emotions based on the body actions of others. This remarkable skill relies on the accurate discrimination of both body form and body motion cues (Giese and Poggio, 2003; de Gelder, 2006; Blake and Shiffrar, 2007). In the present series of experiments, we asked whether the perceptual discrimination of body form and body motion cues relies on common or dissociable neural mechanisms.
There has been a considerable debate on the role of form and motion in body action discrimination (Beintema and Lappe, 2002; Tadin et al., 2002; Lange and Lappe, 2006; Garcia and Grossman, 2008; Thirkettle et al., 2009; Miller and Saygin, 2013), which has led to two main theoretical models. One account proposes that actions are processed in two parallel pathways, with a ventral stream pathway analyzing body form signals and a dorsal stream pathway analyzing body motion signals (Giese and Poggio, 2003). An alternative view posits that actions are recognized solely on the basis of the concatenation of static snapshots of body poses (Lange and Lappe, 2006). Neurons integrating static body poses have recently been identified (Singer and Sheinberg, 2010; Vangeneugden et al., 2011). However, these findings do not rule out the existence of a separate pathway analyzing actions based on motion cues alone.
The perception of body actions activates multiple regions in posterior temporal cortex (Grosbras et al., 2012), with the most consistently implicated regions being the posterior superior temporal sulcus (pSTS) and the extrastriate body area (EBA; Grossman and Blake, 2002; Peelen et al., 2006). Of these regions, the pSTS is hypothesized to be involved in processing body motion cues (Grossman et al., 2010), whereas the EBA is hypothesized to be involved in processing body form cues (Peelen et al., 2006). This distinction is currently under debate, with evidence both consistent (Michels et al., 2005; Downing et al., 2006) and inconsistent (Jastorff and Orban, 2009; Jastorff et al., 2012) with it. One of the obstacles to empirically dissociating body form and body motion processing is that the perception of body form and the perception of body motion are intimately linked and quickly integrated. For example, intact body motion gives strong clues about the underlying body form (Peelen et al., 2006).
To examine the contributions of form and motion processing to body action perception, we first used fMRI to examine stimulus information contained in temporal brain areas using multivoxel pattern analysis (MVPA). We observed a striking double dissociation, with patterns of activity in EBA representing body form information and patterns of activity in pSTS representing body motion information of the same actor. We then devised a novel stimulus set that allows form and motion discriminations to be dissociated psychophysically. Finally, we used offline repetitive transcranial magnetic stimulation (TMS) to selectively disrupt these discriminations by targeting the regions that were found in the fMRI study to contain information about body form (EBA) and body motion (pSTS). Together, these results provide strong converging evidence that the perception of body form and the perception of body motion rely on distinct functional and neural mechanisms.
Materials and Methods
Participants
Twelve healthy volunteers (8 females, ages 24–40 years) participated in all experiments. All participants were neurologically healthy, right-handed, had normal or corrected-to-normal vision, gave written informed consent approved by the Institutional Review Board of the University of Trento, were compensated for their participation and were naive to the purpose of the experiments. None of the participants had seen point-light or ellipse walkers before. Participants first completed the extensive fMRI session, followed by the psychophysical experiment, and ended with two TMS stimulation sessions.
fMRI
General information.
Participants first completed two localizer experiments, consisting of three pSTS/human middle temporal complex (hMT+) localizer runs of 158 volumes each and two EBA/fusiform body area (FBA) localizer runs of 151 volumes each. After these localizer runs, they performed four runs of the main experiment (207 volumes each). In all functional imaging runs, participants viewed the stimuli binocularly through a mirror above the head coil. Stimuli were back projected onto a translucent screen by a liquid crystal projector at a frame rate of 60 Hz and a screen resolution of 1280 × 1024 pixels. Stimulus presentation was controlled by a PC running the Psychophysics Toolbox package in Matlab (MathWorks).
Scanning parameters.
A 4T Bruker MedSpec Biospin MR scanner together with an eight-channel birdcage head coil was used to collect whole-brain images. T2*-weighted gradient-recalled echo-planar imaging sequences were used to acquire the functional images with the same parameters for the functional localizers and the main experiment (34 axial slices; voxel dimensions 3 × 3 × 3 mm; TR/TE = 2000/33 ms; flip angle = 73 deg; 64 × 64 matrix; FOV = 192 mm; gap size = 1 mm). Structural images were acquired with an MP-RAGE sequence with 1 × 1 × 1 mm resolution.
fMRI localizer stimuli and tasks.
Regions of interest (ROIs) selective to human bodies, EBA (Downing et al., 2001) and FBA (Peelen and Downing, 2005; Schwarzlose et al., 2005), were identified by contrasting responses to static headless bodies with responses to chairs (see Fig. 2a,c). Stimuli (40 exemplars per category) measured 8° by 6° visual angle and were presented in blocks of 14 s duration, with a total of 21 blocks. Blocks 1, 6, 11, 16, and 21 were fixation-only baseline epochs, with the other blocks showing images of either bodies or chairs in alternating order. Each block comprised 20 individual images, each presented for 300 ms and separated by a 400 ms blank screen. All images appeared against a white background and were position jittered (maximal displacement of 2° in both dimensions). To maintain attention, participants performed a one-back task, pressing a button when a picture was repeated sequentially. Performance on the one-back task was virtually perfect. Across blocks, the number of repetitions varied at random between two and three times. A central red fixation dot was presented throughout the run. Participants performed two runs.
pSTS and hMT+ were localized with data from a localizer experiment consisting of three conditions: intact point-light actions, position-scrambled point-light controls, and static frames of the scrambled point-light control condition (see Fig. 2a,c, respectively). For the dynamic conditions, seven 1 s intact point-light animations or their position-scrambled controls were randomly selected from a database of 25 complex actions (walking, jumping, climbing stairs, etc.; stimuli provided by E. Grossman, University of California, La Jolla, CA) with a 1 s blank interstimulus interval, amounting to a total block duration of 14 s. Each run consisted of a total of 25 blocks: 18 stimulus blocks lasting 14 s and seven fixation blocks (blocks 1, 5, 9, 13, 17, 21, and 25) lasting 8 s. In the static scrambled blocks, a randomly chosen frame from a randomly chosen position-scrambled action was presented for 300 ms and followed by a 700 ms blank screen. This occurred 14 times within each block. A central red fixation dot was presented throughout the whole trial and the target images were position jittered (maximal displacement of 2° in both dimensions) on each trial. To maintain attention, participants performed a one-back task (one or two sequential repetitions in the dynamic blocks and two or three sequential repetitions in the static blocks). Participants performed three runs.
To localize pSTS, blocks of intact point-light actions were contrasted with position-scrambled dynamic controls (scrambling the starting position of each dot, keeping the local motion vectors unaltered), following previous work (Grossman et al., 2000). To localize hMT+, dynamic scrambled actions were contrasted with static frames of these scrambled actions. Because scrambled biological motion displays were created by randomizing the starting position of each dot, there were no body parts or body actions visible in the hMT+ localizer. That is, the dynamic scrambled condition appeared as a set of randomly moving dots and the static frames as static random dot patterns. Thus, our localizer was similar to previous studies that localized hMT+ using contrasts between meaningless motion stimuli and static controls, using a variety of stimulus types including dots moving in one direction (Huk et al., 2002) or in multiple directions (Beauchamp et al., 1997).
fMRI main experiment stimuli and task.
The main experiment consisted of four dynamic point-light walker conditions, which differed in their facing direction (left or right) and walking direction (forward or backward; Fig. 1a), and two static conditions (facing left or right; see Fig. 4a). For the static conditions, the points were connected with lines to emphasize the underlying body pose. Static body poses were extracted from the walking cycles of the dynamic actions.
Stimuli were presented slightly eccentric at 1.5° of visual angle in either the left or right visual field while participants were instructed to maintain fixation on the centrally presented red fixation dot. Visual field presentation was balanced within each run. Displays of all conditions were drawn randomly from a collection of 6 actors walking at two different speeds (Vangeneugden et al., 2010; selected speeds: 4.2 and 6 km/h; i.e., treadmill speed). To maintain attention, participants had to indicate with a button press whether the exact same action was repeated, i.e., same actor walking at the same speed, or, for static conditions, whether the same static frame was shown twice (one-back task). This could happen one or two times within the dynamic blocks and two or three times within the static blocks. Each run consisted of 29 blocks lasting 14 s each. Blocks 1, 8, 15, 22, and 29 were fixation-only epochs. Dynamic blocks contained seven walkers, each presented for 1 s followed by a 1 s blank interstimulus interval while 14 different poses were presented in each static block (for 633 ms, followed by a 367 ms blank).
Preprocessing and data analyses.
Standard data preprocessing and statistical analyses were performed using the Statistical Parametric Mapping package (SPM8, Wellcome Department of Cognitive Neurology, London). Preprocessing steps consisted of correction of slice timing, realignment to the mean of the images to correct for motion, coregistration of anatomical images to the functional images and subsequent reslicing, segmentation of the resulting anatomical images, and spatial normalization of the realigned functional images and the resliced anatomical images to the Montreal Neurological Institute (MNI) template. No spatial smoothing was applied.
ROI definition.
ROIs were defined in individual participants. The contrast (headless bodies > chairs), at a threshold of t = 3.11 (p < 0.001 uncorrected), was used to define EBA and FBA, located in the inferior temporal sulcus (ITS) and the posterior fusiform gyrus, respectively (Fig. 2a,c). hMT+ was localized with the contrast (dynamic scrambled actions > static scrambled body poses), using a threshold of t = 3.11 (p < 0.001 uncorrected; Fig. 2c). Because EBA and hMT+ overlap substantially (Downing et al., 2007), voxels that were selective for both bodies and motion (and that would thus be included in both EBA and hMT+) were excluded from the EBA and hMT+ ROIs. This led to the exclusion of 55.2% (SD = 22.8) of EBA voxels and 54.5% (SD = 17.9) of hMT+ voxels. The exclusion of voxels selective for both bodies and motion was done to increase sensitivity to possible differences in information carried by body-selective and motion-selective voxels. Specifically, as we were interested in information carried by body-selective voxels, we wanted to minimize contamination by motion-selective voxels. It should be noted that because of the exclusion of voxels common to both ROIs, these ROIs are not directly comparable to EBA and hMT+ ROIs of previous studies, which typically included overlapping voxels. To indicate this difference, we labeled these ROIs EBA* and hMT+* (following Schwarzlose et al., 2005, 2008).
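The overlap exclusion can be illustrated with a minimal sketch. It assumes each ROI is represented as a set of indices of voxels that passed the threshold in the corresponding localizer contrast; the function name `exclude_overlap` and the toy indices are hypothetical and are not taken from the original analysis.

```python
def exclude_overlap(body_voxels, motion_voxels):
    """Remove voxels shared by the body- and motion-selective ROIs.

    Returns the reduced ROIs (EBA*, hMT+*) and the percentage of each
    original ROI that was excluded.
    """
    shared = body_voxels & motion_voxels
    eba_star = body_voxels - shared
    hmt_star = motion_voxels - shared
    pct_eba = 100.0 * len(shared) / len(body_voxels)
    pct_hmt = 100.0 * len(shared) / len(motion_voxels)
    return eba_star, hmt_star, pct_eba, pct_hmt

# Toy voxel indices, for illustration only.
eba = {1, 2, 3, 4, 5, 6, 7, 8}
hmt = {5, 6, 7, 8, 9, 10}
eba_star, hmt_star, pct_eba, pct_hmt = exclude_overlap(eba, hmt)
```

By construction, the two reduced ROIs share no voxels, so any difference in the information they carry cannot be attributed to voxels that appear in both.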
pSTS was defined by the contrast (intact actions > scrambled actions), at a threshold of t = 2.34 (p < 0.01 uncorrected), which was necessary to define this region in all participants (Fig. 2a). Only activated voxels in the posterior part of the superior temporal sulcus were included in the ROI.
The mean cluster sizes of the ROIs were (number of voxels): EBA* (752), FBA (512), hMT+* (777), and pSTS (1191).
Multivoxel pattern analyses.
For all six conditions in the main experiment we extracted the response pattern (parameter estimates) for all voxels being part of one of the localized ROIs (Fig. 1a). We incorporated voxels from both hemispheres but we also examined the effects separately for each hemisphere. This procedure was applied separately for each of the four runs for each participant individually. For each voxel, the mean response across all conditions was subtracted from the response to each of the conditions, separately for each run. Responses for the two odd and two even runs were averaged separately whereupon the values were correlated and Fisher transformed (0.5 × log[(1 + r)/(1 − r)], with r = correlation), resulting in an asymmetrical 6 × 6 correlation matrix.
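The split-half correlation procedure can be sketched as follows. This is an illustrative reconstruction and not the original analysis code: it assumes the run-wise patterns have already been averaged into an odd and an even half (subtracting the condition mean per run and then averaging is equivalent, by linearity, to subtracting it after averaging), and the helper names and toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def fisher_z(r):
    """Fisher transform, 0.5 * log[(1 + r) / (1 - r)]."""
    return 0.5 * math.log((1 + r) / (1 - r))

def demean(patterns):
    """Subtract, for each voxel, its mean response across conditions."""
    n_cond, n_vox = len(patterns), len(patterns[0])
    means = [sum(p[v] for p in patterns) / n_cond for v in range(n_vox)]
    return [[p[v] - means[v] for v in range(n_vox)] for p in patterns]

def split_half_matrix(odd, even):
    """Rows index conditions in the odd half, columns in the even half."""
    odd, even = demean(odd), demean(even)
    return [[fisher_z(pearson(o, e)) for e in even] for o in odd]

# Toy data: 3 conditions x 5 voxels per half (the study used 6 conditions).
odd = [[1.0, 0.2, -0.5, 0.8, 0.1],
       [0.9, 0.1, -0.4, 0.2, 0.7],
       [-0.3, 0.6, 0.4, -0.1, 0.0]]
even = [[0.8, 0.3, -0.6, 0.9, 0.0],
        [1.0, 0.0, -0.2, 0.1, 0.6],
        [-0.2, 0.5, 0.5, 0.0, 0.1]]
z = split_half_matrix(odd, even)
```

The matrix is not symmetric because entry (i, j) correlates condition i in the odd half with condition j in the even half, which generally differs from the correlation of condition j in the odd half with condition i in the even half.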
The amount of body form information present within each ROI was defined as the difference between the average correlation of conditions having similar facing orientations (e.g., facing left forward and facing left backward) and conditions having different facing orientations (e.g., facing left forward and facing right forward; Fig. 1b). Such a form index was also calculated separately for stimuli having either similar or different walking directions (Fig. 3b). We also calculated multiple body-motion indices, representing the amount of body motion information present within an ROI, by subtracting the average correlation of conditions having different walking directions from the average correlation of conditions having similar walking directions, across or separately for conditions with similar or different facing orientations (Fig. 3b). Differences between voxelwise correlations were then tested using repeated-measures ANOVAs and t tests (two-tailed) with participant as random factor. The general form and motion indices were also calculated for additional regions of interest FBA and hMT+*.
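A minimal sketch of how such indices could be computed from a Fisher-transformed correlation matrix is given below. It assumes that every cell of the matrix, including same-condition cells, enters the averages; the text does not specify this, so it is an assumption. The function name, the label encoding, and the toy matrix are hypothetical.

```python
def similarity_index(z, labels, dim, other_match=None):
    """Mean z of pairs matching on `dim` minus mean z of pairs differing on it.

    labels: one (facing, walking) tuple per condition.
    dim: 0 for the form index (facing), 1 for the motion index (walking).
    other_match: None uses all pairs; True or False restricts the pairs to
    those that match or differ on the other dimension, respectively.
    """
    other = 1 - dim
    same, diff = [], []
    for i, a in enumerate(labels):
        for j, b in enumerate(labels):
            if other_match is not None and (a[other] == b[other]) != other_match:
                continue
            (same if a[dim] == b[dim] else diff).append(z[i][j])
    return sum(same) / len(same) - sum(diff) / len(diff)

# Four dynamic conditions: facing (L/R) x walking (F/B).
labels = [("L", "F"), ("L", "B"), ("R", "F"), ("R", "B")]
# Toy matrix in which only facing orientation drives pattern similarity.
z = [[0.4 if a[0] == b[0] else 0.1 for b in labels] for a in labels]

form_index = similarity_index(z, labels, dim=0)
motion_index = similarity_index(z, labels, dim=1)
form_across_walking = similarity_index(z, labels, dim=0, other_match=False)
```

In this toy case the form index is positive and the motion index is zero, which is the pattern expected of a region whose activity patterns carry facing orientation but not walking direction.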
We also computed a third set of indices (form generalization indices) that compared correlations across static and dynamic conditions (Fig. 4b). More specifically, we contrasted the correlation between conditions having the same body orientation with the correlation between conditions differing in body orientation, with all correlations computed across static and dynamic conditions.
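The generalization index differs from the form index only in which condition pairs enter the averages. The sketch below keeps static-versus-dynamic pairs exclusively; the label lists, the function name, and the toy matrix are hypothetical, and the exact cell selection used in the original analysis is an assumption.

```python
def generalization_index(z, facing, cue):
    """Form index restricted to pairs of one static and one dynamic condition."""
    same, diff = [], []
    for i in range(len(z)):
        for j in range(len(z)):
            if cue[i] == cue[j]:
                continue  # skip static-static and dynamic-dynamic pairs
            (same if facing[i] == facing[j] else diff).append(z[i][j])
    return sum(same) / len(same) - sum(diff) / len(diff)

# Six conditions: four dynamic walkers followed by two static poses.
facing = ["L", "L", "R", "R", "L", "R"]
cue = ["dynamic"] * 4 + ["static"] * 2
# Toy matrix in which same-facing pairs are more similar regardless of cue.
z = [[0.5 if fa == fb else 0.2 for fb in facing] for fa in facing]

gen_index = generalization_index(z, facing, cue)
```

A positive value indicates that facing orientation is represented similarly whether it is conveyed by a static line figure or by a moving point-light walker.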
Standard GLM analysis.
We assessed the effect of stimulus type (dynamic actions or static bodies) on the average response magnitude (percentage signal change) of each ROI and each hemisphere. The activity vectors for all four dynamic conditions and the two static conditions were pooled. For each area separately (EBA*, FBA, hMT+*, and pSTS) we ran a two (stimulus type: dynamic or static) by two (hemisphere: right or left) repeated-measures ANOVA.
Psychophysical experiment
Stimuli.
All stimuli were presented on a 19 inch LCD monitor set at 1280 × 1024 resolution at a refresh rate of 60 Hz. A chinrest was placed 57 cm in front of the monitor. The centers of elongated ellipses were placed at the locations of the major joints of a walker (see Fig. 6a). We manipulated the orientation of the ellipses relative to the underlying body posture, being either aligned or misaligned. In the aligned conditions, the orientation of ellipses was consistent with the overall body form. Misaligned ellipses were rotated by 45° relative to the invisible line connecting two dots belonging to the same body part. We reasoned that such a manipulation would severely disrupt form processing because the formation of the body Gestalt is hindered, while motion trajectories of individual ellipses remain unaltered (Thirkettle et al., 2010; Poljac et al., 2011; Thurman and Lu, 2013). Given the changing body posture over time, orientations of the ellipses were updated every frame. The movement patterns of 13 ellipses defining the walker were adopted from Vanrie and Verfaillie (2004). The walkers walked on the frontoparallel plane as if on a treadmill. Individual frames were mirror-flipped along the vertical axis to create different facing directions. Backward sequences were generated by reversing the temporal order of the frames of the forward walker. The starting frame of each movie was randomly selected on every trial. We manipulated the presentation duration and the number of ellipses, depending on the task. Because we always presented fewer than the total number of ellipses, we used the limited-lifetime technique in which each ellipse could be randomly allocated every 100 ms to any of the 13 joint locations (Beintema and Lappe, 2002). The walker subtended 10° of visual angle vertically by 5.8° horizontally at the maximum lateral extension of the ankles. The constituent ellipses were defined by orthogonal Gaussian axes with SDs of 13.7 and 3.2 arcmin.
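The geometric manipulations described above can be sketched in a few lines. This is an illustration under stated assumptions and not the stimulus code used in the study: joint positions are taken to be (x, y) pairs, the direction of the 45° rotation is chosen arbitrarily, and all function names are hypothetical.

```python
import math

def ellipse_orientation(joint_a, joint_b, misaligned=False):
    """Orientation in degrees, within [0, 180), of an ellipse on a limb segment.

    Aligned ellipses follow the line connecting the two joints of a body
    part; misaligned ellipses are rotated by 45 degrees from that line.
    """
    dx = joint_b[0] - joint_a[0]
    dy = joint_b[1] - joint_a[1]
    angle = math.degrees(math.atan2(dy, dx))
    if misaligned:
        angle += 45.0  # rotation direction is an assumption
    return angle % 180.0

def mirror_frame(frame):
    """Flip a frame about the vertical axis to change the facing direction."""
    return [(-x, y) for x, y in frame]

def backward_sequence(frames):
    """Reverse the temporal order of frames to obtain backward walking."""
    return frames[::-1]

# A vertical limb segment: aligned at 90 degrees, misaligned at 135 degrees.
aligned = ellipse_orientation((0.0, 0.0), (0.0, 1.0))
rotated = ellipse_orientation((0.0, 0.0), (0.0, 1.0), misaligned=True)
```

Because the orientation is recomputed from the current joint positions, calling `ellipse_orientation` on every frame reproduces the frame-by-frame updating described in the text.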
A small red fixation dot was presented on the center of the screen throughout the entire trial while position of the stimulus was jittered (maximum displacement of 2° in both dimensions) from trial to trial. We collected 20 repetitions for each stimulus condition.
Tasks.
We used two discrimination tasks: a facing orientation task and a walking direction task requiring body form or body motion discrimination, respectively. In the facing task, participants had to report the facing orientation of a forward walking figure (i.e., facing to the left or to the right), while they had to report the walking direction (forward or backward) of a leftward facing walker in the walking direction task. In the facing task we presented walkers composed of 1, 2, 3, or 6 ellipses and we used four exposure times: 17, 33, 50, or 100 ms. In the walking direction task we presented on average more ellipses (3, 6, 8, or 12) for a longer exposure time: 33, 100, 200, or 300 ms. These parameters were chosen based on pilot studies where we adjusted stimulus parameters to achieve a useful dynamic range in performance (data not shown).
The walkers were preceded by a small red fixation dot presented for 500 ms. Participants were instructed to keep fixation throughout the trial and to respond as accurately as possible. Feedback was provided on each trial. We presented the stimuli in four different blocks obtained by combining task (facing or forward/backward) and ellipse orientation (aligned or misaligned). The order was counterbalanced across participants and divided into two sessions on different days.
Data analysis.
The psychophysical data were analyzed by means of a four-way repeated-measures ANOVA with the following within-subjects factors and associated levels: two tasks (facing orientation and walking direction), two types of alignment (aligned and misaligned ellipses), four numbers of ellipses (the specific numbers depending on the task), and four stimulus presentation durations (the specific durations depending on the task).
TMS
Stimuli and tasks.
Participants underwent the TMS experiment after having completed the fMRI and the psychophysical experiments. The TMS experiment consisted of two separate sessions conducted on different days. In each session either EBA or pSTS was stimulated, counterbalanced across participants (see Fig. 7a). Based on the participants' performance (see above, Psychophysical experiment) we selected two stimulus conditions, i.e., a certain number of ellipses and certain presentation duration, yielding an “easy” (75–80% correct) and a “hard” condition (65–70% correct). This was done by fitting the data on the aligned ellipse walkers with a cumulative Gaussian and selecting stimulus parameters that yielded 75–80% and 65–70% accuracy ranges. We did this to increase our chances of avoiding ceiling and floor effects in performance. The results revealed that both easy and hard conditions were within the useful performance range for all of the participants. These conditions were then presented in either the facing orientation or walking direction task. Only walkers with ellipse orientations aligned to the underlying body pose were presented during the TMS experiment. On each trial, stimulus position was jittered by randomly varying the position of the stimulus center within a small window (1.5° × 1.5°) around the fixation spot. Furthermore, the initial starting frame was randomized for each trial while the same limited-lifetime technique as in the psychophysical experiment was applied here.
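The selection of stimulus parameters from a fitted psychometric function can be sketched as follows. The exact parameterization and fitting method are not given in the text, so this example makes several assumptions: a two-alternative task with 50% chance performance, a single stimulus dimension (presentation duration in ms), and a coarse least-squares grid search. All names and data are hypothetical.

```python
from statistics import NormalDist

def psychometric(x, mu, sigma):
    """Cumulative Gaussian scaled from chance (0.5) to perfect (1.0)."""
    return 0.5 + 0.5 * NormalDist(mu, sigma).cdf(x)

def fit_cumulative_gaussian(xs, accs):
    """Least-squares grid search over mu and sigma (both in steps of 5)."""
    best = None
    for mu in range(0, 301, 5):
        for sigma in range(5, 301, 5):
            err = sum((psychometric(x, mu, sigma) - a) ** 2
                      for x, a in zip(xs, accs))
            if best is None or err < best[0]:
                best = (err, mu, sigma)
    return best[1], best[2]

def level_for_accuracy(target, mu, sigma):
    """Invert the fitted function to find the level giving `target` accuracy."""
    return NormalDist(mu, sigma).inv_cdf(2.0 * target - 1.0)

# Synthetic accuracies generated from known parameters, for illustration.
durations = [33, 100, 200, 300]
accuracies = [psychometric(d, 100, 50) for d in durations]

mu, sigma = fit_cumulative_gaussian(durations, accuracies)
easy_level = level_for_accuracy(0.775, mu, sigma)  # middle of 75-80%
hard_level = level_for_accuracy(0.675, mu, sigma)  # middle of 65-70%
```

In the study, conditions were defined by both the number of ellipses and the presentation duration, so the actual selection would operate over two stimulus dimensions; the one-dimensional case is shown only to make the inversion step concrete.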
Stimulation parameters and site localization.
We used an offline TMS paradigm with pulses delivered through a figure-eight coil having a wing diameter of 70 mm via a Magstim 2T Rapid stimulator. We applied a 20 min train of repetitive TMS pulses delivered at 1 Hz. Stimulation was set to 70% of the maximum stimulator output. Our selection of fixed stimulation intensity for all participants was motivated by a previous successful study applying TMS pulses over similar areas (Grossman et al., 2005). High-resolution functional images were overlaid onto the anatomical images obtained from the fMRI experimental session, using a frameless stereotaxy system (BrainSight, Rogue Research). A 3D-anatomical reconstruction was used to visualize the Talairach coordinates of the projected cortical target of the pSTS and EBA stimulation sites in all participants. Average Talairach coordinates of the stimulation sites across participants were (55, −54, 11) and (50, −68, 3) for pSTS and EBA, respectively. The neuronavigation system provided online feedback on the position of the coil relative to the area of interest throughout the entire stimulation session. Deviations from the targeted focus were minimized and were typically <1 mm. Participants were provided with earplugs to minimize the noise discomfort produced by the TMS machine.
Procedure.
Participants' performance was measured before TMS (premeasurement: PRE), immediately after TMS (TMS measurement) and after a 1 h break (postmeasurement: POST; see Fig. 7a). Each test lasted ∼12 min and consisted of four blocks of randomly alternating 1–1.5 min long tasks (facing orientation or walking direction) with easy and hard trials intermixed within each block.
Instructions on the screen informed the participants when the task changed. Within each task, we presented 48 repetitions of both the easy and hard stimulus conditions, e.g., 12 trials showing an easy forward walker and 12 trials showing an easy backward walker. This resulted in 576 trials per task (192 trials each for the pre-TMS, TMS, and post-TMS measurements) per participant in each TMS session (1152 trials in total). The trial structure during the TMS experiment was identical to the structure used in the psychophysical experiments (see above, Psychophysical experiment) with the exception that no feedback on correct responses was provided.
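The trial counts can be checked with a short calculation. The breakdown below is one reading that is consistent with the reported totals: it assumes 48 repetitions for each combination of difficulty level and response alternative (e.g., forward and backward), which the text does not state explicitly. The variable names are illustrative.

```python
reps_per_stimulus = 48   # from the text
alternatives = 2         # assumed: e.g., forward and backward walkers
difficulty_levels = 2    # easy and hard
measurements = 3         # PRE, TMS, POST
tasks = 2                # facing orientation and walking direction

trials_per_measurement = reps_per_stimulus * alternatives * difficulty_levels
trials_per_task = trials_per_measurement * measurements
trials_per_session = trials_per_task * tasks
```

Under this reading, the 12 trials per stimulus mentioned in the example would correspond to one of the four blocks within a measurement (4 × 12 = 48).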
Data analysis.
A two-way repeated-measures ANOVA with TMS condition (TMS over right EBA or right pSTS) and task (facing orientation or walking direction) as within-subject factors was used to analyze the data of the TMS experiment. For this analysis, we only incorporated the data obtained directly after the 20 min stimulation (“TMS” period). We also ran similar analyses for the easy and hard conditions separately. Next, we specifically compared performance levels in the different epochs (PRE vs TMS and POST vs TMS), applying a Bonferroni correction for multiple comparisons.
Results
fMRI evidence for a double dissociation of neural mechanisms involved in body form and body motion processing
We used fMRI and multivoxel pattern analysis to test whether dissociated neural mechanisms underlie body form and body motion processing. Specifically, we asked whether multivoxel patterns in EBA selectively carry information about the form of the perceived body action and whether multivoxel patterns in pSTS selectively carry information about the motion of the perceived body action.
Participants (N = 12) viewed whole-body actions presented as point-light walkers, a stimulus that strongly conveys body actions with minimal visual cues (Johansson, 1973). Crucially, the stimuli varied on two dimensions, which were orthogonally manipulated. The form dimension was manipulated by changing the facing orientation of the walker: whether the body of the walker was facing leftward or rightward. The motion dimension was manipulated by changing the walking direction of the walker: whether the walker was walking forward or backward (Lange and Lappe, 2006; Vangeneugden et al., 2011; Fig. 1a). Participants viewed these brief (1 s) movies while performing a one-back repetition detection task. Body form information was measured as the degree to which multivoxel activity patterns discriminated between leftward versus rightward oriented walkers, whereas body motion information was measured as the degree to which multivoxel activity patterns discriminated between forward versus backward walking walkers. Regions that represent facing orientation (leftward vs rightward) should show relatively similar activity patterns to walkers facing the same orientation (e.g., leftward), even when these actors walk in opposite directions (forward vs backward). In contrast, regions that represent walking direction should show relatively similar activity patterns to walkers moving in the same direction (e.g., forward), even when these walkers are oriented in opposite directions (leftward vs rightward).
Multivoxel pattern analysis
Our main analysis focused on EBA and pSTS regions of interest, which were functionally localized using independent localizers (see Materials and Methods). Motion-selective voxels were excluded from EBA (indicated by the label EBA*). Pattern similarity was computed by correlating activity patterns across two halves of the data (odd vs even runs). Subsequently, two indices were computed (Fig. 1b) to capture the amount of body form and body motion information contained in multivoxel patterns of activity. The form index was computed by contrasting correlations between conditions that had the same facing orientation with correlations between conditions that had a different facing orientation. Analogously, the motion index was computed by contrasting correlations between conditions that had the same walking direction with correlations between conditions that had a different walking direction.
A two-way repeated-measures ANOVA with ROI (EBA*, pSTS; Fig. 2a) and Index (form, motion) as factors revealed a highly significant interaction (F(1,11) = 227.08, p < 0.0001), reflecting a double dissociation between the information types contained in the two ROIs (Fig. 2b). Information useful to discriminate between actions having different facing orientations was significant in EBA* (t test, p < 0.0002) but not in pSTS (t test, p = 0.58), while information to discriminate between different walking directions was significant in pSTS (t test, p < 0.0001) but not in EBA* (t test, p = 0.25). Information in nearby regions of interest hMT+* and FBA (Fig. 2c) could not account for the information found in EBA* and pSTS (Fig. 2d). In FBA we did not observe any difference between the amount of form and motion information (paired t test, p = 0.58). Moreover, both indices did not show any significant amount of discriminable information (t tests, p > 0.26). The difference between the amount of form and motion information did reach significance in hMT+* (paired t test, p < 0.01), mainly due to a marginally significant negative form index (t test, p < 0.05).
We obtained similar results when analyzing both hemispheres separately. A three-way repeated-measures ANOVA with Hemisphere (left, right), ROI (EBA*, pSTS), and Index (form, motion) did not indicate an interaction of Hemisphere with the two-way interaction between ROI and Index (F(1,11) = 0.025, p = 0.878; Fig. 3a). Again we noted a significant amount of form information in both left and right EBA* (t tests, p < 0.01), but not in left or right pSTS (t tests, p > 0.57). By contrast, the amount of motion information was significant in both left and right pSTS (t tests, p < 0.0001), not significant in left EBA* (t test, p = 0.8), and only marginally significant in right EBA* (t test, p = 0.045).
Furthermore our results did not depend on how we calculated our indices as we observed similar effects when calculating the form index only for conditions that differed in walking direction and the motion index only for conditions that differed in facing direction (Fig. 3b). Useful information to discriminate between actions with different facing orientations independent of walking direction was found in EBA* (t tests, p < 0.012), but not in pSTS (t tests, p > 0.59), whereas the opposite pattern, useful information to discriminate between actions with different walking directions independent of facing orientation, was found in pSTS (t tests, p < 0.0006) but not in EBA* (t tests, p > 0.203; Fig. 3c).
In addition to the four conditions used to calculate the form and motion indices, the experiment also included leftward and rightward facing static bodies in which the dots were connected by lines to increase the salience of the underlying body posture (Fig. 4a). This allowed us to compute a third index, the form generalization index, which compared correlations across static and dynamic conditions. In other words, if a region is coding for a specific body posture, it should represent the same posture regardless of how it is defined, thus exhibiting cue invariance. To extract the generalization index, we contrasted correlations between static and dynamic conditions that had the same facing orientation with correlations between static and dynamic conditions that had a different facing orientation (Fig. 4b). This analysis closely replicated the form index results, showing significantly more form information in EBA* than in pSTS (paired t test, p < 0.00001), with significant information about facing orientation contained in EBA* (t test, p < 0.00001) but not in pSTS (t test, p = 0.46; Fig. 4c). Activity patterns in FBA also carried significant information about facing orientation (t test, p < 0.0005), whereas this was not the case for hMT+* voxels (t test, p = 0.62; Fig. 4d).
Univariate analysis
In line with recent findings (Pitcher et al., 2011; Thompson and Baccus, 2012), the average response magnitudes within the ROIs reliably discriminated dynamic from static conditions. For each ROI, we compared the responses to the pooled static bodies (left and right) and the pooled dynamic actions (four actions) using paired t tests, both separately for each hemisphere and pooled over both hemispheres. We observed significant effects with greater responses to static bodies than to dynamic actions in EBA* and FBA, in both hemispheres separately and pooled (t tests, p < 0.013). The opposite preference was noted in hMT+* and pSTS for both hemispheres pooled (t tests, p < 0.015). When hemispheres were analyzed separately, this preference was significant for hMT+* in the left hemisphere (t test, p < 0.001) but not in the right hemisphere (t test, p = 0.068), and for pSTS in the right hemisphere (t test, p = 0.01) but not in the left hemisphere (t test, p = 0.2; Fig. 5a).
To test whether average response magnitude in our ROIs differed between the four dynamic conditions, we ran two-way repeated-measures ANOVAs with Facing orientation (left, right) (Fig. 1a) and Walking direction (forward, backward) as factors on the average response magnitudes extracted for each hemisphere separately or on the responses pooled over hemispheres. No main effects or interactions were significant in any of the ROIs when averaging across hemispheres (all p > 0.1). When analyzing both hemispheres separately, we only observed a significant main effect of Facing orientation for left EBA* (F(1,11) = 12.59, p < 0.005) with a slightly greater response for the leftward facing walkers than for the rightward facing walkers (Fig. 5b). No other effects were significant. Thus, in contrast to multivoxel pattern analysis, analysis of overall response magnitude was generally insufficient to differentiate between subtle stimulus differences, such as dynamic actions having different facing orientations (as a cue for form processing) or walking in different directions (as a cue for motion processing).
Causal evidence for a double dissociation of neural mechanisms involved in body form and body motion discrimination
The fMRI results provided strong evidence that EBA and pSTS represent different properties of the same perceived body action. Next, we asked whether the representations uncovered in these regions causally contribute to behavioral discriminations of body form and motion.
A critical first step in addressing this question was the development of a behavioral paradigm that allows for sensitive measures of both body form and body motion discrimination within the same stimulus set. We achieved this by comparing behavioral performance for the same dynamic stimuli but using different tasks (Beintema and Lappe, 2002; Vangeneugden et al., 2011). To validate the use of these tasks as revealing form versus motion processing, we developed a new stimulus set in which form and motion were varied independently (Fig. 6a). Walkers consisted of a varying number of small ellipses. Form information was manipulated by varying the alignment of the ellipses that made up the walker. By misaligning the ellipses, we distorted the overall shape of the walker while leaving the movement trajectories intact (Poljac et al., 2011). Motion information was manipulated by varying the presentation duration (and thus the amount of motion) of the display (Neri et al., 1998). To manipulate the overall task difficulty, we additionally varied the number of presented ellipses. Finally, to further separate form and motion cues, walking direction was fixed during the form task (always forward), whereas facing orientation was fixed during the walking direction task (always leftward) (Lange and Lappe, 2006).
Results of the psychophysical experiment (Fig. 6b) showed that the misalignment of the ellipses had a stronger effect on the facing orientation task (discriminating body form: facing leftward or rightward) than on the walking direction task (discriminating body motion: walking backward or forward; interaction between alignment and task: F(1,11) = 26.8, p < 0.0005). In contrast, presentation duration had a pronounced effect on body motion discrimination while having little effect on body form discrimination (interaction between duration and task: F(3,33) = 48.7, p < 0.00001). These results indicate that body form and body motion discrimination rely on different visual cues: reliable shape information is crucial for discriminating body posture but has virtually no effect on discriminating the direction of body movements. Conversely, motion information is crucial for discriminating the direction of body movements but only mildly affects the discrimination of body form.
We next used the behavioral data to select stimuli that would allow for a sensitive test of whether EBA and pSTS are causally involved in discriminating body form and body motion. Specifically, for each participant and task, we used results of the behavioral experiment to select two groups of stimulus parameters (duration, number of ellipses) that yielded behavioral performance between 65 and 70% correct (hard condition) and between 75 and 80% (easy condition). This ensured that baseline performance was roughly equated across tasks for all participants and that performance was neither at floor nor at ceiling (see Materials and Methods). Offline repetitive TMS (1 Hz) was used to selectively interfere with neural activity in EBA or pSTS, previously localized in each individual during the fMRI experiment. Previous studies have shown that low-frequency 1 Hz TMS temporarily reduces excitability of the cortex within the stimulated area and that this effect outlasts the period of stimulation as measured behaviorally (Battelli et al., 2009; Tadin et al., 2011). Stimulation was delivered for 20 min over right EBA and, in a separate session, to right pSTS (Fig. 7a). Participants performed the exact same tasks as those used in the psychophysical experiment, discriminating facing orientation and walking direction.
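The stimulus selection step can be sketched as a simple filter over one participant's behavioral results for one task. This is only an illustration: the function name, the data layout, and every duration, ellipse count, and accuracy value below are hypothetical, and only the two target ranges (65 to 70% and 75 to 80% correct) come from the text.

```python
def select_parameters(accuracy, low, high):
    """Return the (duration_ms, n_ellipses) settings whose proportion
    correct lies within the inclusive range [low, high]."""
    return sorted(
        params for params, acc in accuracy.items() if low <= acc <= high
    )

# Hypothetical proportion-correct values for one participant and one task.
# Keys are (presentation duration in ms, number of ellipses).
accuracy = {
    (100, 4): 0.58,
    (100, 8): 0.67,
    (200, 4): 0.69,
    (200, 8): 0.77,
    (400, 4): 0.79,
    (400, 8): 0.91,
}

hard = select_parameters(accuracy, 0.65, 0.70)  # hard condition
easy = select_parameters(accuracy, 0.75, 0.80)  # easy condition
```

Running the same filter separately for each participant and each task is what keeps baseline performance roughly matched across tasks, even when the underlying stimulus parameters differ.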
Similar results were observed for the easy and the hard conditions, and thus we first averaged performance across these two conditions. A two-way repeated-measures ANOVA with TMS condition (TMS over EBA, TMS over pSTS) and task (facing orientation, walking direction) as within-subject factors revealed a highly significant interaction (F(1,11) = 43.00, p < 0.0001), reflecting a double dissociation: TMS over EBA disrupted the form discrimination task significantly more than TMS over pSTS (paired t test, p < 0.05), whereas TMS over pSTS disrupted motion discrimination significantly more than TMS over EBA (paired t test, p < 0.0001; Fig. 7b). Relative to the preintervals and postintervals, TMS over EBA significantly reduced performance on the form discrimination task (Bonferroni corrected post hoc comparisons, both p < 0.0005). Conversely, TMS over pSTS had a strong detrimental effect on motion discrimination (Bonferroni corrected post hoc comparisons, both p < 0.001). Finally, when considered separately, both easy and hard conditions revealed significant interaction effects (F(1,11) = 65.5, p < 0.00001 and F(1,11) = 10.67, p < 0.01, respectively), mirroring the double dissociation found for the average of the two conditions.
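For readers less familiar with this design, the interaction test can be sketched as follows. In a 2 x 2 fully within-subject design, the interaction F with (1, n - 1) degrees of freedom equals the squared one-sample t statistic of each participant's difference of differences. The sketch assumes the dependent measure is a per-participant TMS-induced drop in accuracy. That measure, the function name, and the four-participant toy data are illustrative assumptions, not values from the study.

```python
from math import sqrt
from statistics import mean, stdev

def interaction_F(scores):
    """Site x task interaction for a 2 x 2 within-subject design.

    scores: list of per-participant dicts keyed by (site, task), each
    value being the (assumed) TMS-induced drop in proportion correct.
    Returns (F, df1, df2), using F = t**2 on the difference of differences.
    """
    contrast = [
        (s[("EBA", "form")] - s[("EBA", "motion")])
        - (s[("pSTS", "form")] - s[("pSTS", "motion")])
        for s in scores
    ]
    n = len(contrast)
    t = mean(contrast) / (stdev(contrast) / sqrt(n))
    return t * t, 1, n - 1

# Hypothetical drops in accuracy for four participants.
scores = [
    {("EBA", "form"): 0.10, ("EBA", "motion"): 0.01,
     ("pSTS", "form"): 0.02, ("pSTS", "motion"): 0.12},
    {("EBA", "form"): 0.08, ("EBA", "motion"): 0.00,
     ("pSTS", "form"): 0.01, ("pSTS", "motion"): 0.09},
    {("EBA", "form"): 0.12, ("EBA", "motion"): 0.02,
     ("pSTS", "form"): 0.03, ("pSTS", "motion"): 0.10},
    {("EBA", "form"): 0.09, ("EBA", "motion"): 0.01,
     ("pSTS", "form"): 0.00, ("pSTS", "motion"): 0.11},
]

F, df1, df2 = interaction_F(scores)
```

A large interaction F alone does not establish a double dissociation. As in the analysis above, it needs to be followed by comparisons showing that each stimulation site impairs its own task more than the other site does.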
Discussion
The current study provides converging evidence from fMRI, psychophysical, and TMS experiments for a double dissociation between mechanisms underlying body form and body motion discriminations. To allow for direct comparisons of body form and body motion discriminations, we operationalized these cues by comparing activity patterns between different actions (fMRI) and by investigating performance levels in different tasks for the same stimuli (psychophysics and TMS). In an fMRI experiment, distinct brain regions were found to represent form and motion: multivoxel response patterns in EBA carried information about the body posture of the observed point-light walker but not of its motion, whereas multivoxel response patterns in pSTS carried information about the motion direction, but not the body posture, of the same stimulus. Notably, EBA activity exhibited cue invariance, representing body posture invariantly of the way body form was depicted. In the psychophysical experiment, we found that misaligning the ellipses that made up a walker stimulus disrupted body posture discriminations, although having little effect on body motion discriminations. Conversely, varying the presentation duration of the walker primarily affected body motion discrimination (Neri et al., 1998). Finally, TMS showed that EBA and pSTS are causally involved in discriminating body posture and body motion, respectively. Together, these findings further our understanding of the functionality of two key areas of the social brain and provide important constraints for models of body action perception.
Our results support biological motion perception models that assume separate form and motion pathways (Giese and Poggio, 2003). For example, the intact behavioral performance in the motion discrimination task after disrupting form processing (by misaligning the orientation of the ellipses) suggests that body motion discrimination does not rely on input from form-processing pathways alone (but see Beintema and Lappe, 2002; Lange and Lappe, 2006). Similarly, the intact behavioral performance in the motion discrimination task after TMS over EBA (a stimulation that affected form processing) implies that body motion discrimination does not rely on form-related input from EBA (Vangeneugden et al., 2009, 2011). Therefore, the current results favor action perception models that include distinct form and motion pathways over models in which action perception relies solely on integrating sequences of body postures (Thompson et al., 2005; Lange and Lappe, 2006; Singer and Sheinberg, 2010). More broadly, these findings are consistent with a functional separation of form and motion processing. Our results show that this functional separation occurs even for body action perception, a task in which form and motion are closely linked. This conclusion is consistent with a recent individual differences study (Miller and Saygin, 2013), which found that sensitivity to form cues in biological motion processing was correlated with scores on social cognition questionnaires (e.g., empathy), whereas sensitivity to motion cues was correlated with scores on motor imagery questionnaires.
Body-selective EBA and motion-selective hMT+ are located near to each other and partially overlap (Downing et al., 2007; Peelen et al., 2006; for detailed analyses, see Ferri et al., 2012; Weiner and Grill-Spector, 2011). Because of this overlap, we included a motion-selective cortex localizer to show that body form information in EBA reflected effects in body-selective cortex rather than in overlapping motion-selective cortex. This was achieved by excluding motion-selective voxels from EBA and by showing differential effects in body- and motion-selective regions (Figs. 2, 4). In the TMS study, although we targeted EBA, fully separating EBA and hMT+ is not possible due to limited spatial resolution of TMS. Thus, it is possible that TMS over EBA also resulted in a modest disruption of motion processing in hMT+. This is relevant as hMT+ is believed to feed into pSTS (Grossman et al., 2000). Given that hMT+ performs more basic visual motion analysis than pSTS, this would lead to the prediction that EBA stimulation might not only impair performance on the facing orientation (body form) but, as a result of indirect effects on pSTS, also impair the walking direction (body motion) task. In contrast, our results showed that EBA stimulation only affected the facing orientation task. This result is consistent with a TMS study that investigated the roles of hMT+ and pSTS in biological motion (Grossman et al., 2005) and found that, unlike stimulation of pSTS, stimulation of hMT+ did not lead to impairments in point-light biological motion detection.
There are several reasons why hMT+ stimulation may not lead to impaired biological motion discriminations (Grossman et al., 2005). First, it is likely that a (relatively small) loss in local motion sensitivity does not significantly impair more global biological motion sensitivity. Indeed, adding noise to biological motion displays leaves discrimination of global biological motion relatively intact (Neri et al., 1998). Consistent with this argument, our behavioral study showed that changes in the local features of the walker (the orientation of the ellipses) did not strongly affect walking direction judgments. Furthermore, right pSTS is thought to receive input from both left and right hMT+ (Grossman et al., 2000), with unilateral TMS leaving input from the nonstimulated hMT+ unaffected. Future studies using bilateral hMT+ stimulation could address this possibility.
More generally, the vicinity and partial overlap of body-form-selective EBA and motion-selective hMT+ suggests that this region as a whole cannot be straightforwardly characterized as belonging either to a form/ventral stream pathway or a motion/dorsal stream pathway (Kravitz et al., 2013). Indeed, hMT+ and EBA also closely overlap with object-form-selective lateral occipital complex (Kourtzi et al., 2002; Denys et al., 2004; Downing et al., 2007; Kolster et al., 2010). Further research is needed to understand the functional interactions between these regions, for example during the processing of motion-defined shapes (Peelen et al., 2006; Farivar et al., 2009), the integration of form and motion signals, or the extraction of 3D shape from disparity and motion cues (Orban, 2011).
Social interactions depend on correctly perceiving subtle dynamics in facial and bodily movements (Aviezer et al., 2012), which reveal the emotions, dispositions, and intentions of others. Our cross-methodological results (correlational, behavioral, and causal) suggest that the role of pSTS in social cognition may particularly relate to the processing of such dynamic social information. This suggestion dovetails with previous theoretical (Allison et al., 2000; Haxby et al., 2000), electrophysiological (Oram and Perrett, 1996; Vangeneugden et al., 2009, 2011), imaging (Grossman and Blake, 2002; Thompson and Baccus, 2012), stimulation (Grossman et al., 2005; van Kemenade et al., 2012), and lesion studies (Saygin, 2007) that propose that pSTS is a crucial region in body motion perception. Moreover, our results suggest that the EBA is primarily involved in the static analysis of body form, consistent with previous theoretical (Downing and Peelen, 2011), fMRI (Michels et al., 2005; Downing et al., 2006; Peelen et al., 2006), TMS (Urgesi et al., 2004, 2007; Pitcher et al., 2009), and lesion (Moro et al., 2008) studies. Importantly, to counter potential differences in attention and low-level characteristics when comparing static versus dynamic displays, we operationalized both dimensions as dynamic displays but varied the cues useful for their discrimination, unlike most previous studies. Moreover, with these stimulus manipulations we ensured that the two cues were not confounded, as it has been shown that static bodies can imply motion (Kourtzi and Kanwisher, 2000). Thus, the current study provides the first causal evidence of a double dissociation between EBA and pSTS, linking these regions to the analysis of distinct properties of the same body action. An analogous double dissociation exists between FFA and pSTS during face perception (Pitcher et al., 2011); however, future experiments will need to clarify the causal nature of this particular dissociation.
Interestingly, distinct clinical profiles have been associated with abnormalities in EBA and pSTS. Eating disorders have been associated with abnormalities in EBA (Suchan et al., 2010), whereas autism spectrum disorders have been linked to pSTS abnormalities (Kaiser and Shiffrar, 2009) or dysfunctional interconnectivity between pSTS and EBA (McKay et al., 2012). The current results shed new light on these findings, suggesting differential importance of motion and form perception impairments in different clinical disorders. Moreover, our novel psychophysical paradigm (Fig. 6), which allows for a detailed and independent assessment of motion and form processing abilities, might be a useful tool for investigating autism, eating disorders, and other social cognitive disorders. By allowing a separate assessment of body form and body motion perception, our approach may help indicate which functions are preserved and which are affected in pathological populations.
Finally, our study also makes a novel methodological contribution by combining MVPA and TMS. Multivoxel pattern analysis is an increasingly popular tool for analyzing fMRI data. However, the underlying neural changes that lead to the observed information in multivoxel patterns are unclear (Op de Beeck, 2010), and the contribution of multivoxel information to behavior has been questioned (Todd et al., 2013). Our novel combined MVPA-TMS approach provides important evidence that regions carrying information about specific stimulus properties can be shown to be causally involved in the behavioral discrimination of these properties.
To conclude, the present multimethod study provides strong converging evidence that neural mechanisms processing body form and body motion are dissociable and localized in the EBA and pSTS, respectively.
Footnotes
We thank J.A. Assad, R. Blake, P. Downing, J.A. Heimel, A.P. Saygin, and J. Thompson for comments and discussions on an earlier version of the paper.
- Correspondence should be addressed to Dr. Joris Vangeneugden, Department of Cortical Structure and Function, Netherlands Institute for Neuroscience, Meibergdreef 47, 1105 BA, Amsterdam, The Netherlands. j.vangeneugden@nin.knaw.nl