Abstract
Many functional neuroimaging studies of biological motion have used as stimuli point-light displays of walking figures and compared the resulting activations with those evoked by the same display elements moving in a random or noncoherent manner. Although these studies have established that biological motion activates the superior temporal sulcus (STS), the use of random motion controls has left open the possibility that coordinated and meaningful nonbiological motion might activate these same brain regions and thus call into question their specificity for processing biological motion. Here we used functional magnetic resonance imaging and an anatomical region-of-interest approach to test a hierarchy of three questions regarding activity within the STS. First, by comparing responses in the STS with animations of human and robot walking figures, we determined (1) that the STS is sensitive to biological motion itself, not merely to the superficial characteristics of the stimulus. Then we determined that the STS responds more strongly to biological motion (as conveyed by the walking robot) than to (2) a nonmeaningful but complex nonbiological motion (a disjointed mechanical figure) and (3) a complex and meaningful nonbiological motion (the movements of a grandfather clock). In subsequent whole-brain voxel-based analyses, we confirmed robust STS activity that was strongly right lateralized. In addition, we observed significant deactivations in the STS that differentiated biological and nonbiological motion. These voxel-based analyses also revealed regions of motion-related positive activity in other brain regions, including MT or V5, fusiform gyri, right premotor cortex, and the intraparietal sulci.
Introduction
Neuroimaging research indicates that viewing human movements engages a part of the human visual system located in and near the superior temporal sulcus (STS) region (for review, see Decety and Grèzes, 1999; Allison et al., 2000). This area is anterior and superior to the more general motion-sensitive regions MT or V5 (MT/V5) (Zeki et al., 1991; Watson et al., 1993; McCarthy et al., 1995). Many previous studies of biological motion have used as stimuli point-light displays of ambulating figures and compared the resulting activations with those evoked by the same display elements moving in a random or noncoherent manner (Bonda et al., 1996; Howard et al., 1996; Grèzes et al., 2001; Grossman and Blake, 2001, 2002; Vaina et al., 2001). Although these studies have established that biological motion activates the posterior STS, the use of random motion controls has left open the possibility that coordinated and meaningful nonbiological motion might activate these same brain regions and thus call into question their specificity for processing biological motion.
Here we used functional magnetic resonance imaging (fMRI) to evaluate whether biological motion activated the STS region more than a meaningful and coordinated nonbiological motion. Animated figures conveyed four movement categories (see Fig. 1). One category was a human figure viewed in profile and walking in place. Another was a collection of cylinders comprising a “robot” that walked with the same amplitude and speed as the animated human figure. A third category involved the same cylinders used for the robot, rearranged into a nonbiological “mechanical” form. The components moved with the same amplitude and speed as in the robot, but the perceived motion was disjointed and nonbiological. The fourth category was a grandfather clock, a familiar nonbiological mechanical device composed of several anthropomorphic features such as a clock “face” with moving parts and a pendulum that swung like a leg and with component motions that were coordinated and purposeful. We reasoned that a region tuned to biological motion should activate more in response to observing the walking human figure and the robot than to the clock or the disjointed mechanical cylinders.
Using a hypothesis-driven anatomical region-of-interest (ROI) approach, we tested a hierarchy of three questions regarding activity within the STS. First we compared responses to the human and the robot to determine whether the STS is sensitive to (1) biological motion itself or merely to the superficial characteristics of the stimulus. Then we tested whether the STS responds more strongly to (2) biological motion (as conveyed by the robot) than to a nonmeaningful but complex motion (the mechanical figure) and (3) a complex and meaningful nonbiological motion (the grandfather clock). In addition to these primary analyses, we performed voxel-based analyses to identify regions of motion-related activity in brain regions outside of the STS.
Materials and Methods
Subjects
Thirteen right-handed healthy subjects (seven females, six males) ranging in age from 20 to 27 years (mean of 23 years) provided written informed consent to participate in a study approved by the Duke University Medical Center Institutional Review Board. All had normal or corrected-to-normal visual acuity and were paid for participating.
Experimental design
We created four animated figures using the Poser 4.0 software program (Curious Labs, Santa Cruz, CA). These were a human, a robot, a mechanical assembly, and a grandfather clock (Fig. 1). In an event-related design, the four figures were always present, and on each trial, one of the four figures moved for 2 sec. Trials were separated by a 16 sec intertrial interval (ITI), during which all four figures were present on the screen and none were moving. The left to right order of the figures varied across runs, and the order of movements was randomized across trials. Over the course of 192 trials, subjects saw 48 exemplars of each category of motion. In one condition, the human, viewed in profile, walked in place as if on a treadmill (Fig. 1, top left). In another condition, the robot, composed of a sphere (torus) and four rods that simulated a head, torso and hips, two arms, and two legs, respectively, moved to simulate the sweeping of arms and legs and the sway of hips that comprise human walking (Fig. 1, top right). Each part of the robot moved as much as did its counterpart on the human figure (e.g., the arm of the robot swung to and fro the same distance and with the same angular relation to the shoulder as did the arm of the human, the torso swayed in a manner identical to the human's hips, and the legs swept the same space at the same velocity). The illusion of walking conveyed by the robot was quite compelling because we added a slight bounce to the sphere “head” and a sway to the torus “hips.” Thus, although the figures differed in form, their motions were nearly identical. The mechanical assembly was composed of pieces identical to those of the robot, but the configuration of pieces was different, as were the axes of rotation. The amount of movement made by the mechanical assembly was identical to the amount of movement made by the robot and human, thereby creating a good control for the motion of the robot (Fig. 1, bottom left). 
Finally, the grandfather clock had two moving hands and a pendulum below. The pendulum was the same size as the leg of the robot, and the amount of movement made by the clock was very similar to the amount of movement made by the other stimulus figures (Fig. 1, bottom right). We selected the grandfather clock because it shared several anthropomorphic features with the human (e.g., a clock face with moving parts) and the robot (e.g., a pendulum that swung like the arms or legs of the robot) and because it is a familiar device with meaningful and expected motions. In what follows, we use human, robot, mechanical, and clock as shorthand for the stimulus conditions. In addition, we created biological motion and nonbiological motion meta conditions by averaging the responses of voxels to human and robot (biological) and to clock and mechanical (nonbiological).
We used CIGAL (Voyvodic, 1999) to control stimulus presentation. Stimuli were back projected onto a translucent 56 × 66 cm screen placed at the feet of the subject using an LCD projector (XGA resolution, 900 lumens). Subjects viewed the stimuli through glasses with angled mirrors. Subjects were instructed only to attend to the screen at all times. Trials were randomized within runs lasting 6.5 min (24 trials per run). Each subject completed eight runs or 192 trials (48 trials per condition).
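The trial counts described above (eight runs of 24 trials, 192 trials in all, 48 per condition) can be checked with a short scheduling sketch. This is an illustration only and not the CIGAL presentation script: the function name `make_schedule`, the seed, and the assumption that each run contained exactly six trials of each condition are hypothetical, because the text states only that trials were randomized within runs.

```python
import random

CONDITIONS = ["human", "robot", "mechanical", "clock"]
N_RUNS = 8
TRIALS_PER_RUN = 24

def make_schedule(seed=0):
    """Return a list of runs; each run is a shuffled list of condition labels."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(N_RUNS):
        # Assumption: conditions are balanced within each run (6 trials each).
        run = CONDITIONS * (TRIALS_PER_RUN // len(CONDITIONS))
        rng.shuffle(run)
        schedule.append(run)
    return schedule

schedule = make_schedule()
total_trials = sum(len(run) for run in schedule)
per_condition = {c: sum(run.count(c) for run in schedule) for c in CONDITIONS}
```

Under these assumptions, `total_trials` is 192 and every entry of `per_condition` is 48, matching the counts reported in the text.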
fMRI methods
MRI scanning was performed on a General Electric 4T LX NVi scanner system equipped with 41 mT/m gradients and a birdcage radio frequency (RF) head coil for transmitting and receiving (General Electric, Milwaukee, WI). Sagittal T1-weighted localizer images were first acquired and used to define a target volume for a semiautomated high-order shimming program. After shimming, the anterior commissure (AC) and posterior commissure (PC) were identified in the midsagittal slice for orienting the anatomical and blood oxygenation level-dependent (BOLD) contrast functional slice selection. A series of 60 high-resolution coronal T1-weighted images [repetition time (TR), 450 msec; echo time (TE), 20 msec; field of view (FOV), 24 cm; image matrix, 256 × 256; slice thickness, 5 mm; in-plane resolution, 0.9375 mm] was acquired from posterior to anterior along the AC-PC line. Functional images were collected using the same slice prescription as the T1-weighted images, using a spiral imaging sequence sensitive to BOLD contrast [TR, 2.0 sec; TE, 30 msec; FOV, 24 cm; image matrix, 64 × 64; flip angle, 62°; slice thickness, 5 mm; in-plane resolution, 3.75 mm]. Each imaging run began with five discarded RF excitations to allow for steady-state equilibrium.
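The in-plane resolutions quoted above follow from dividing the field of view by the matrix size, reading the anatomical matrix as 256 × 256 and the functional matrix as 64 × 64. The following is a minimal arithmetic check; the function name is illustrative.

```python
def in_plane_resolution_mm(fov_cm, matrix_size):
    """In-plane voxel size in mm for a square field of view and square matrix."""
    return fov_cm * 10.0 / matrix_size

anatomical_mm = in_plane_resolution_mm(24, 256)  # T1-weighted images
functional_mm = in_plane_resolution_mm(24, 64)   # spiral BOLD images
```

With a 24 cm field of view this gives 0.9375 mm for the anatomical images and 3.75 mm for the functional images, the values reported in the acquisition parameters.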
Data analysis
Our analytic strategy followed closely that used in previous studies from our laboratory (Jha and McCarthy, 2000; Yamasaki et al., 2002; Pelphrey et al., 2003) and consisted of a focused hypothesis-driven anatomical ROI approach supplemented with follow-up secondary and more exploratory voxel-based analyses. The centroid of whole-volume BOLD activation for each functional image volume within each time series was computed and plotted for each subject and imaging run. No subject had greater than a 3-mm deviation in the x-, y-, or z-dimensions. The MR signal for each voxel was temporally aligned to correct for the interleaving of slice acquisition within each TR. Temporal alignment was accomplished by fitting the time series of each voxel with a cubic spline and then resampling this function for all voxels at the onset of each TR. Epochs time-locked to stimulus onsets were extracted from the time series and averaged according to the four trial types, with the temporal order relative to stimulus onset maintained. The averaged epochs consisted of one image volume before (-2 sec) and seven image volumes after (2-14 sec) the onset (0 sec) of each stimulus event, for nine image volumes. The averaged MR signal time epochs were used in the analytic procedures described below.
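The epoch extraction and averaging step can be sketched as follows for a single voxel and a single trial type. This is a simplified illustration rather than the laboratory's actual code: the function name, the toy time series, and the onset positions are assumptions, and slice-timing correction is taken as already applied.

```python
def average_epochs(series, onsets, pre=1, post=7):
    """Average stimulus-locked epochs from one voxel's time series.

    series: signal value per image volume (one per 2 sec TR).
    onsets: volume indices at which stimuli of one trial type began.
    Each epoch spans `pre` volumes before onset, the onset volume,
    and `post` volumes after it (nine volumes with the defaults).
    """
    length = pre + 1 + post
    sums = [0.0] * length
    count = 0
    for onset in onsets:
        start, stop = onset - pre, onset + post + 1
        if start < 0 or stop > len(series):
            continue  # skip epochs that run off either end of the series
        for i, value in enumerate(series[start:stop]):
            sums[i] += value
        count += 1
    if count == 0:
        raise ValueError("no complete epochs available")
    return [s / count for s in sums]

# Toy series: a made-up response inserted at two onsets.
toy_response = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0, 0.0]
toy_series = [0.0] * 40
toy_onsets = [5, 20]
for onset in toy_onsets:
    for k, value in enumerate(toy_response):
        toy_series[onset + k] = value

toy_epoch = average_epochs(toy_series, toy_onsets)
epoch_times_sec = [2 * (i - 1) for i in range(len(toy_epoch))]
```

Here `toy_epoch` has nine values at times -2, 0, 2, ..., 14 sec. In practice the function would be applied once per trial type so that the four conditions yield four averaged epochs per voxel.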
Hypothesis testing within the STS anatomical ROI. Two research assistants who were blind to the subsequent statistical analyses of the data drew ROI on the anatomical images of each subject. ROI were traced on the left and right STS. Identification of anatomical landmarks and ROI was guided by human brain atlases (Roberts et al., 1987; Mai et al., 1997; Duvernoy, 1999). ROI file labels indicated the distance (in millimeters) posterior from the AC, facilitating registration of activity from similar ROI across subjects. The STS was traced on 14 slices ranging from 0 to 65 mm posterior from the AC (see Fig. 2, top right inset).
The average signal from all voxels within each ROI was computed for each of the nine time points within the averaged epochs and plotted to visualize the time course of the hemodynamic response (HDR) for each ROI during each stimulus condition. The HDR was examined separately for each slice and hemisphere within each ROI so that regional and stimulus condition-related effects in the form of the HDR could be evaluated. Averages of the change in signal intensity from baseline to 6 and 8 sec after stimulus onset were calculated for each condition as measures of waveform peak amplitude. Paired-sample t tests were performed to evaluate differences in this amplitude measure as a function of stimulus condition. These analyses, which allowed us to test an a priori defined set of hypotheses concerning amplitude differences as a function of stimulus condition, constituted our primary analysis of the data. ROIs were also used to group and count activated and deactivated voxels that were identified in a correlation analysis (described below).
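The peak amplitude measure can be sketched as below for one averaged epoch. The text does not define the baseline precisely, so taking the prestimulus (-2 sec) time point as baseline is an assumption made here for illustration, as are the function names and the toy values.

```python
TR_SEC = 2.0
EPOCH_START_SEC = -2.0

def index_at(time_sec):
    """Index of a time point within an epoch sampled every TR from -2 sec."""
    return int(round((time_sec - EPOCH_START_SEC) / TR_SEC))

def peak_amplitude(epoch, baseline_sec=-2.0, peak_secs=(6.0, 8.0)):
    """Mean change from baseline at the 6 and 8 sec time points.

    Assumption: baseline is the single prestimulus time point.
    """
    baseline = epoch[index_at(baseline_sec)]
    changes = [epoch[index_at(t)] - baseline for t in peak_secs]
    return sum(changes) / len(changes)

# Made-up averaged epoch (nine time points, -2 to 14 sec).
toy_epoch = [0.0, 0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0, 0.0]
toy_peak = peak_amplitude(toy_epoch)
```

For this toy epoch the values at 6 and 8 sec are 4.0 and 3.0, so the measure is 3.5. One such score per subject and condition would then feed the paired-sample t tests.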
Voxel-based analyses. We supplemented the primary ROI analysis with a correlation analysis to identify and count voxels for each stimulus condition within each ROI with a time course after stimulus that correlated significantly with an empirically defined HDR reference waveform. The reference waveform was the grand mean waveform representing the average HDR time course within seven slices (30-60 mm posterior from the AC) of the STS across conditions and subjects. We generated a t statistic for each voxel across runs by correlating the averaged (across runs) 16 sec MR signal time epochs (generated as described above) from each voxel with the reference waveform. T statistics were calculated from the correlation coefficients, and activated voxels were defined as those with suprathreshold t values, with the threshold for activation set at t > 1.96. Deactivated voxels (those with a negative-going response) were also identified, with the threshold for deactivation set at t < -1.96. Counts of activated and deactivated voxels within each slice of the STS were converted to percentages relative to the number of voxels in that ROI.
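The correlation-based classification of voxels can be illustrated with the sketch below. The text does not give the formula used to convert correlation coefficients to t statistics; the standard conversion t = r·sqrt((n − 2)/(1 − r²)) with n time points is assumed here, and the reference waveform and voxel values are made up.

```python
import math

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def r_to_t(r, n):
    """Assumed conversion of a correlation to a t statistic with n - 2 df."""
    denom = 1.0 - r * r
    if denom <= 0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / denom)

def classify(voxel, reference, threshold=1.96):
    """Label a voxel 'activated', 'deactivated', or 'none'."""
    t = r_to_t(pearson_r(voxel, reference), len(reference))
    if t > threshold:
        return "activated"
    if t < -threshold:
        return "deactivated"
    return "none"

# Made-up reference waveform and three toy voxels.
reference = [0.0, 0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0, 0.0]
positive = [0.1, -0.2, 1.2, 2.7, 4.1, 3.2, 0.8, 0.1, -0.1]
negative = [-v for v in positive]
noise = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1]

labels = [classify(v, reference) for v in (positive, negative, noise)]
percent_activated = 100.0 * labels.count("activated") / len(labels)
```

The final line mirrors the conversion of voxel counts to percentages relative to the number of voxels in an ROI, here a toy ROI of three voxels.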
To explore the extent to which populations of voxels demonstrated different patterns of activity as a function of stimulus condition, and to identify possible regions of activity outside of the anatomical ROIs that were the primary focus of our analysis, we performed voxel-based analyses on the across-subjects combined data. Across-subjects functional time course volumes and t statistic activation maps were calculated for each of the four original stimulus conditions and the computed biological and nonbiological meta conditions, combining data from all subjects. Before combining across subjects, we spatially normalized the images to a template image set from a randomly selected subject. Alignment factors for the functional images were calculated on a slice-by-slice basis using custom software (M. J. McKeown). This software implemented a nonlinear optimization of translation, rotation, and stretch values (6 parameters) on the basis of the cost function of maximizing the correlation between the (low-pass filtered and high-pass filtered) template slice and the to-be-normalized current slice. The normalization algorithm used the high-resolution anatomical images. Before normalization, the brain was extracted from the anatomical images of each subject to eliminate the influence of extraneous regions such as the skull and neck. The normalized individual t statistic maps were combined across subjects using a random-effects model (Lazar et al., 2002) implemented using a custom-written script for the MATLAB software (Mathworks, Natick, MA). The resultant statistical maps were thresholded at a voxelwise uncorrected p < 0.001.
Results
Anatomical ROI analyses of the STS
We examined the grand average waveforms summed across all conditions on a slice-by-slice basis for all voxels in the STS. Twenty-eight (2 hemispheres × 14 image slices) waveforms are presented in Figure 2 (top). The horizontal axis shows the distance in 5 mm bins posterior from the AC. Within each bin, increasing time is displayed from left to right (-2 to 14 sec). Positive HDRs occurred 4-6 sec after stimulus onset (0 sec) at each slice. In the right hemisphere (red lines), we identified significant positive HDRs in the posterior STS (40-65 mm). In the left hemisphere (blue lines), positive HDRs were observed only in the posterior slices 55-65 mm from the AC. In half of the slices with substantial motion-evoked activity, HDRs were of larger amplitude in the right hemisphere than in the left hemisphere. The largest positive HDRs were observed in the right hemisphere 45-55 mm from the AC. Notably, HDRs dropped below baseline in the left hemisphere STS for most slices (0-50 mm), and we observed negative-going HDRs in the anterior half of the right hemisphere STS.
As expected, percentages of activated voxels followed patterns of distribution similar to those observed for the magnitudes of responses in the HDR waveforms (Fig. 2, bottom). Overall, a greater percentage of voxels was activated in the right hemisphere (mean, 19%; SE, 2%) than in the left hemisphere (mean, 2%; SE, 0.7%) of the STS across experimental conditions, indicating right hemisphere laterality for motion processing in the STS (t(12) = 6.32; p < 0.05; two-tailed). This effect did not differ by experimental condition.
We calculated the peak amplitude scores for each subject by averaging the 6 and 8 sec time points across the 14 slices of the right hemisphere STS. Using these measurements, we tested the hierarchy of questions described previously with three paired-sample t tests (Fig. 3). First, we compared responses to human [mean, 0.89 (SE, 0.33)] and robot [mean, 0.89 (SE, 0.29)], and equivalent responses under these two conditions indicated that the STS was responding to the biological motion conveyed by the figure, not the form of the figure (t(12) = 0.006; p > 0.995). Having established that robot and human evoked similar responses from the STS, we used the robot as a representative of biological motion and evaluated whether the STS region responded more strongly to biological motion (robot) than to a nonmeaningful but complex nonbiological motion (mechanical) or a coherent complex meaningful nonbiological motion (clock). The STS responded more strongly to robot than to clock [mean, 0.07 (SE, 0.32)], (t(12) = 1.78; p < 0.05; one-tailed) or mechanical [mean, 0.16 (SE, 0.38)], (t(12) = 2.90; p < 0.05; two-tailed).
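For readers who want to see the form of these comparisons, the following is a minimal paired-sample t test. The 13 values per condition are invented for illustration and are not the study data; only the sample size (13 subjects, hence 12 degrees of freedom) is taken from the text.

```python
import math

def paired_t(a, b):
    """Paired-sample t statistic and degrees of freedom for two conditions."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt(var / n), n - 1

# Invented peak amplitude scores for 13 hypothetical subjects.
toy_robot = [1.2, 0.4, 0.9, 1.5, 0.3, 1.1, 0.8, 1.0, 0.6, 1.3, 0.7, 0.9, 1.0]
toy_clock = [0.3, 0.1, 0.5, 0.2, 0.4, 0.0, 0.6, -0.1, 0.2, 0.1, 0.3, -0.2, 0.1]

t_value, df = paired_t(toy_robot, toy_clock)
```

The resulting statistic would be compared against the t distribution with `df` degrees of freedom, one-tailed or two-tailed as stated for each contrast.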
Voxel-based analyses
Waveforms from STS voxels activated by motion
We conducted waveform analyses using the subset of STS voxels identified previously as positively activated to any one of the four stimulus conditions (i.e., the union of all activated voxels within the STS ROIs). Because of the strong laterality of motion processing observed in the previous analyses, this waveform analysis was performed separately for the two hemispheres.
First, we determined whether there was differential activity to biological and nonbiological. Both elicited significant responses from the selected voxels in the right STS. However, the response to biological was greater than that to nonbiological at 6 sec (t(12) = 2.54; p < 0.05; one-tailed) and 8 sec (t(12) = 2.48; p < 0.05; one-tailed). Next, we reasoned that if voxels in the STS respond to biological motion per se, then the HDRs elicited by robot and human should be very similar. The two waveforms did not differ (Fig. 4, top panel). As shown in the second panel of Figure 4, the response to robot was greater than was the response to mechanical at 6 sec (t(12) = 1.85; p < 0.05; one-tailed) and 8 sec (t(12) = 1.83; p < 0.05; one-tailed). Finally, the response to robot was greater than was the response to clock at 6 sec (t(12) = 2.46; p < 0.05; one-tailed) and 8 sec (t(12) = 1.91; p < 0.05; one-tailed) (Fig. 4, third panel).
We obtained a different pattern of effects from a parallel interrogation of those left hemisphere STS voxels activated by motion. Here, the HDR to biological was greater in peak amplitude than the response to nonbiological at each time point ≥4 sec after stimulus onset. However, this effect was driven by a particularly strong positive response to human coupled with a below baseline dip in the HDR elicited by clock at the final two time points (Fig. 4, bottom). Moreover, in contrast to the right hemisphere STS, robot, mechanical, and clock evoked roughly equivalent HDRs in the left hemisphere.
Deactivations differentiate biological and nonbiological motion in the STS
Our analyses to this point established that the posterior right hemisphere STS responded robustly to motion stimuli, and demonstrated that the STS responded overall more strongly to biological motion than to nonbiological motion. Nevertheless, inspection of the waveforms evoked by motion (Fig. 2, top) suggested significant negative-going activity in most of the left hemisphere STS and in the anterior half of the right hemisphere STS. These deactivations might be involved in distinguishing biological and nonbiological motion. To address this possibility, we identified voxels in the left hemisphere and right hemisphere STS that were significantly deactivated by any one of the four categories of motion and examined whether the negative activity from these voxels differed by stimulus condition. Deactivated voxels were defined as those that displayed negative-going HDRs that correlated above threshold (t < -1.96) with the inverse of the reference waveform.
Similar percentages of deactivated voxels were observed in the left hemisphere [mean, 6.44% (SE, 1.39%)] and right hemisphere [mean, 7.03% (SE, 1.63%)] STS. In the right hemisphere STS, more voxels were deactivated in response to nonbiological motion [mean, 8.29% (SE, 1.37%)] than to biological [mean, 5.53% (SE, 1.95%)] (t(12) = 3.14; p < 0.01; two-tailed). The waveforms from the deactivated voxels from both hemispheres differentiated the conditions just as the waveforms from activated voxels did. The magnitude of the negative-going HDR was greater for nonbiological compared with biological at 6 sec (t(12) = 2.41; p < 0.05; two-tailed) (Fig. 5). This effect did not differ by hemisphere.
Activations outside of the STS
Areas of significant motion-evoked activity in addition to the STS were identified in a voxel-by-voxel random effects analysis of the group-averaged and spatially normalized data. As shown in Figure 6, an area posterior and inferior to the STS region of activation, probably corresponding to area MT/V5, activated to all categories of movement. The location of the region of MT/V5 activation in the present study corresponds closely to those reported in other studies of nonbiological motion (Zeki et al., 1991; McCarthy et al., 1995). In contrast to the STS, MT/V5 activated most strongly to mechanical and responded equivalently to the other conditions. Other significant motion-related activations were localized to (1) the right premotor cortex, (2) the intraparietal sulci bilaterally, and (3) the fusiform gyri bilaterally. Activity within these regions did not differentiate the stimulus conditions.
Discussion
The present study extends previous reports of activation in the STS region to observation of whole-body biological motion (for review, see Decety and Grèzes, 1999; Allison et al., 2000). We observed robust activity to both biological and nonbiological motion in the STS, and this activity was greatest within the crux of the right STS, at the point where the STS bifurcates into the straight segment and the ascending limbs. Motion-related activity in the STS was decidedly right lateralized. Examination of the stimulus-sorted time epochs after stimulus revealed that both biological and nonbiological motion activated the same regions, but that biological motion evoked larger HDRs than nonbiological motion in the STS. Moreover, the anterior-to-posterior distributions of activity were different for the left and right hemispheres, with areas of greatest activity in the right localized anterior to those same areas in the left. Indeed, in the two right hemisphere slices (45-50 mm posterior from the AC) that showed the strongest positive response to motion, there were strong negative-going responses in the left hemisphere to the same stimuli. We currently have no explanation for this interesting observation. Strong activation was also observed in a region corresponding to the motion area MT/V5, but this area did not show any preference for the biological motion stimuli used here. Notably, a dissociation between this region and the STS was observed such that this region responded more strongly to mechanical than to robot, a pattern opposite that observed in the STS.
We (Puce et al., 1998; Allison et al., 2000; Pelphrey et al., 2003; Wright et al., 2003) and other groups (Bonda et al., 1996; Howard et al., 1996; Calvert et al., 1997; Grèzes et al., 2001; Grossman and Blake, 2001, 2002; Vaina et al., 2001) have consistently reported that lateral temporal-parietal activity, particularly near the STS, is evoked by biological motion. These studies have provided important data but have typically used only a single stimulus category, and there has been little or no control for complex and meaningful nonbiological motion. Therefore, it has not been established whether the STS was preferentially engaged by biological motion or could be activated by other complex coordinated meaningful motions. Our findings are noteworthy because we compared biological motion (walking) with the complex but nonbiological motion of the grandfather clock and found that the STS responded more strongly to biological motion. We selected the grandfather clock because it shared several anthropomorphic features with the human and because it was an easily recognized mechanical device. We also compared biological motion with a more typical control condition involving a meaningless yet complex nonbiological motion (that of the mechanical figure), and again confirmed greater STS activity to biological compared with nonbiological motion. Finally, through our observation of equivalent STS activity to the human and the robot, which differed in form but not motion, we determined that the STS is sensitive to biological motion itself, not merely to the surface features of the stimulus.
The present findings show that the STS region is sensitive to the distinction between biological and nonbiological motion, but we cannot conclude on this basis alone that this is the primary organizing principle in this region. The STS could be organized around other dimensions that are typically confounded with biological motion (e.g., whether the motion is intentional, goal directed, or signals the approach or avoidance of the moving object relative to the observer). Some of these issues have been investigated within the context of biological motion, and the results indicate that the STS is sensitive to the context in which the motion occurs (Pelphrey et al., 2003; Wright et al., 2003). For example, the perception by an observer of a gaze shift that acquires a target in the visual field activates the STS differently than does the same gaze shift to a location in empty space. The STS is also activated when individuals make complex social judgments about socially relevant stimuli (Winston et al., 2002) or when subjects attribute intentionality to self-propelled animate entities (Castelli et al., 2000; Blakemore et al., 2001). Thus, the pattern of amplitude differences observed in this study is equally consistent with an emerging understanding of the importance of the STS as one component of a larger system involved in interpreting the emotional and social valence of motion (Allison et al., 2000; Adolphs, 2003).
One unexpected finding in the present study was the presence of voxels evincing negative-going HDRs that were more numerous and larger in amplitude for nonbiological than biological. Although the meaning of negative BOLD responses is as yet unclear, several studies have suggested that they may represent a decrease in neuronal firing, or deactivations (Gusnard and Raichle, 2001). When the STS was considered as a whole, the differential spatial distribution and amplitude of the deactivations strongly contributed to the overall amplitude differences observed between biological and nonbiological. Positive activations in the posterior STS have typically been the focus of fMRI studies of biological motion perception, and most previous studies have only reported difference activations; thus neither the activations nor deactivations evoked by nonbiological control stimuli have been described. The waveforms from deactivated STS voxels differentiated conditions in a manner strikingly similar (but opposite) to the pattern observed from activated voxels. In particular, the negative response to nonbiological motion was greater in amplitude than the negative response to biological motion. This suggests that deactivations in the STS carry significant information about biological motion and other socially relevant stimuli in addition to that conveyed by regions of activation.
The pattern of deactivations observed in the current study is consistent with findings from an fMRI study by Mitchell et al. (2002), in which they compared activity during judgments about words describing people or objects. They found relatively little change from baseline in brain regions including the STS for people judgments, but identified significant deactivations in the STS and other regions for object judgments. Adolphs (2003) commented that findings such as these might indicate that the baseline activity of this region of the brain reflects a mode of operation tuned to processing social information. On this account, relatively high baseline activity increases slightly when social stimuli are presented and decreases significantly in the presence of nonsocial stimuli. Future work will likely offer new insights into the mechanisms of social information processing by examining the conditions under which socially relevant stimuli activate or deactivate portions of the STS.
Footnotes
This research was supported by the Department of Veterans Affairs and National Institutes of Health Grant MH-05286. K.A.P. was supported by National Institute of Child Health and Human Development Grant 1-T32-HD40127. G.M. was supported by a Career Research Scientist Award from the Department of Veterans Affairs. We thank R. Viola, B. Mack, and A. Song for assistance with several aspects of this research. We thank Dr. Gary Glover for providing source code for the spiral pulse sequence.
These results were reported in preliminary form at the 10th Annual Cognitive Neuroscience Meeting, New York, NY.
Correspondence should be addressed to Dr. Gregory McCarthy, Duke-UNC Brain Imaging and Analysis Center, 163 Bell Building, Box 3918, Durham, NC 27710. E-mail: gregory.mccarthy@duke.edu.
T. V. Mitchell's present address: Eunice Kennedy Shriver Center, Waltham, MA 01655.
Copyright © 2003 Society for Neuroscience 0270-6474/03/236819-07$15.00/0