When a child reaches toward a cookie, the watching parent knows immediately what the child wants. The neural basis of this ability to interpret other people’s actions in terms of their goals has been the subject of much speculation. Research with infants has shown that 6-month-olds respond when they see an adult reach to a novel goal but habituate when an adult reaches to the same goal repeatedly. We used a similar approach in an event-related functional magnetic resonance imaging experiment. Adult participants observed a series of movies depicting goal-directed actions, with the sequence controlled so that some goals were novel and others repeated relative to the previous movie. Repeated presentation of the same goal caused a suppression of the blood oxygen level-dependent response in two regions of the left intraparietal sulcus. These regions were not sensitive to the trajectory taken by the actor’s hand. This result demonstrates that the anterior intraparietal sulcus represents the goal of an observed action.
On seeing another person act, we automatically interpret the elemental movements in terms of the actor’s goals, intentions, desires, and beliefs. In particular, goals are central to action planning and to our interpretation of other people’s actions. There is extensive evidence that adults encode actions in terms of their outcomes (Hommel et al., 2001). Preschoolers (Bekkering et al., 2000) and even 6-month-old infants are able to detect and respond to other people’s goals (Woodward, 1998). Thus, goals may provide a fundamental unit for action representation, but the neural basis of goal in the human brain has not been localized.
We present here an investigation of the representation of immediate goals. Such goals are characterized by the conjunction of a particular object with a particular action sequence, for example, reaching, grasping, and taking a cookie. In a hierarchical system of action representation (Keele et al., 1990), immediate goals can be placed above elemental actions such as reaching or grasping but below task goals such as preparing a snack (Fig. 1). Because the same action elements can contribute to different immediate goals, interpreting a goal goes beyond recognizing an observed movement pattern. It involves understanding the actor’s desire in reaching for the object and is a step toward recognizing the actor as an intentional agent. We note that intentions have features in common with goals, but intentionality is more general and has also been applied to the evaluation of unpredicted actions (Pelphrey et al., 2004; Saxe et al., 2004) or actions in context (Iacoboni et al., 2005).
There are few previous investigations of the neural basis of goals in adult humans, but goal detection has been established in infants. Woodward (1998) showed 6-month-old infants scenes of an adult reaching toward one of two objects in a habituation paradigm. She found that the infants dishabituate when an adult reaches toward a new goal object but not when the adult moves along a new movement path to the old goal object, demonstrating that infants encode adults’ goals. Habituation in infants is typically measured as a decrease in looking time when the same stimulus is presented repeatedly. In the monkey brain, repetition of the same stimulus leads to decreased neuronal firing (Desimone, 1996), and in adult humans, repetition of a stimulus often results in a reduction in blood oxygen level-dependent (BOLD) signal in brain areas that encode that stimulus, as measured by functional magnetic resonance imaging (fMRI). This phenomenon is termed repetition suppression (RS) and has been shown for a wide range of domains including number (Naccache and Dehaene, 2001), objects (Grill-Spector and Malach, 2001), and semantics (Thompson-Schill et al., 1999). By measuring RS in the adult brain during an fMRI version of Woodward’s paradigm, we can localize the neural basis of goal.
Materials and Methods
We used repetition suppression in an event-related fMRI experiment to localize the neural representation of immediate goals. Twenty participants gave their informed consent to take part in the study in accordance with the requirements of the local ethics board. Nine were male, the mean age was 24.7 years, and 19 were right-handed (one was ambidextrous) according to the Oldfield handedness inventory. In the scanner, participants viewed sequences of movies, each 2.5 s long, separated by a blank screen for 0.7 s (Fig. 1) and were instructed simply to watch the movies. Each movie depicted an actress’s hand reaching, grasping, and taking one of two objects. After a sequence of nine movies, participants either answered a yes–no question about the last movie or rested for 6 s. The question tested participants’ knowledge of any aspect of the movie, for example, “Did she move to the left?” or “Did she take a tool?” During a sequence, the upcoming question could not be predicted, so participants were required to monitor everything they saw to answer correctly, and 91% of responses were correct.
Each sequence of nine movies began with a randomly chosen movie designated as new. The subsequent movies were chosen according to a two-by-two factorial design with factors goal and trajectory, each with two levels, novel and repeated (Fig. 2). Novel and repeated trials were defined in relation to the previous trial only, which meant that each individual movie appeared in every condition. For example, items 1, 2, and 5 in the sequence shown in Figure 2 are the same movie but contribute to different conditions. This means that all the conditions were perfectly balanced for all visual properties. Each participant completed six runs with 10 sequences in each run, giving a total of 120 trials in each condition. Different tool–food object pairs were used in each run, and each pair was chosen to have a similar shape and thus to elicit a similar grasp from the actress but to have very different semantic and intentional associations.
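The assignment of movies to conditions can be illustrated with a short sketch. It assumes each movie is summarized by a hypothetical (goal, trajectory) pair; the function name, the labels, and the example sequence are ours for illustration and are not taken from the stimulus code or from Figure 2.

```python
def label_conditions(sequence):
    """Label each movie relative to the previous movie only.

    `sequence` is a list of (goal, trajectory) tuples; these names are
    illustrative assumptions, not identifiers from the original study.
    """
    labels = ["new"]  # the first movie of a sequence has no predecessor
    for prev, curr in zip(sequence, sequence[1:]):
        goal = "repeated" if curr[0] == prev[0] else "novel"
        traj = "repeated" if curr[1] == prev[1] else "novel"
        labels.append(f"{goal}-goal/{traj}-trajectory")
    return labels


# A hypothetical nine-movie sequence: the same movie (cookie, left)
# appears at positions 1, 2, and 5 but falls into different conditions.
example = [
    ("cookie", "left"), ("cookie", "left"), ("disk", "left"),
    ("disk", "right"), ("cookie", "left"), ("cookie", "right"),
    ("disk", "right"), ("disk", "left"), ("cookie", "right"),
]
labels = label_conditions(example)

# Trial count per condition implied by the design:
# 6 runs x 10 sequences x 8 non-initial movies, split over 4 cells.
trials_per_condition = 6 * 10 * (9 - 1) // 4
```

Because the label depends only on the preceding movie, the same clip is "new" at position 1, a repeated-goal/repeated-trajectory trial at position 2, and a novel-goal/novel-trajectory trial at position 5, which is what balances the visual properties across conditions.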
Trial order was pseudo-randomized by specifying the probability of a novel goal and a novel trajectory over successive trials with two nonharmonic sine functions scaled to range from p = 0.2 to p = 0.8. Before scanning, trial sequences and their associated design matrices were generated and tested for the efficiency of the goal contrast (Henson, 2004), and sequences with low efficiency or very high efficiency (>2 SDs above mean efficiency) were rejected. This step ensures adequate power in the design matrix without introducing noticeable blocks of one condition.
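A minimal sketch of this pseudo-randomization is given here. The two sine periods, the seed, and the function names are assumptions chosen for illustration, since the actual frequencies are not reported; the efficiency screening step is not shown.

```python
import math
import random


def novelty_probability(trial, period, phase=0.0, low=0.2, high=0.8):
    """Sine-modulated probability of a novel event, scaled to [low, high]."""
    s = math.sin(2 * math.pi * trial / period + phase)  # lies in [-1, 1]
    return low + (high - low) * (s + 1) / 2


def generate_flags(n_trials, seed=0, goal_period=7.3, traj_period=11.9):
    """Draw (novel_goal, novel_trajectory) flags for each trial.

    The two periods are arbitrary, non-harmonically related values used
    only for illustration; they are not taken from the original study.
    """
    rng = random.Random(seed)
    flags = []
    for t in range(n_trials):
        p_goal = novelty_probability(t, goal_period)
        p_traj = novelty_probability(t, traj_period)
        flags.append((rng.random() < p_goal, rng.random() < p_traj))
    return flags


flags = generate_flags(80)
```

Slowly drifting probabilities of this kind produce runs in which one condition is locally more frequent, which tends to increase the detectability of the contrast, while the 0.2 to 0.8 bounds keep any one condition from forming an obvious block.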
The experiment was performed in a 1.5 T GE scanner using a standard birdcage head coil. Scanner parameters were: 25 slices per repetition time (TR; 4.5 mm thickness, 1 mm gap), with a TR of 2500 ms, echo time (TE) of 35 ms, a flip angle of 90°, a field of view of 24 cm, and a matrix 64 × 64. The first four volumes of each functional run were discarded to allow magnetization to approach equilibrium, then an additional 145 whole-brain images were collected in each run. After all the functional runs, a high-resolution T1-weighted image of the entire brain was acquired using a spoiled gradient recalled three-dimensional sequence (TR, 7.7 ms; TE, 6–4 ms; flip angle, 15°; field of view, 24 cm; slice thickness, 1.2 mm; matrix, 256 × 192).
Data were realigned and unwarped in SPM2 and normalized to the Montreal Neurological Institute (MNI) template with a resolution of 2 × 2 × 2 mm. A design matrix was fitted for each subject with the movies in each cell of the two-by-two factorial design modeled by a standard hemodynamic response function (HRF) and its temporal derivative and dispersion derivative. New movies and questions were modeled in the same way but not analyzed further. The design matrix weighted each raw image according to its overall variability to reduce the impact of movement artifacts (Diedrichsen and Shadmehr, 2005). After estimation, 9 mm smoothing was applied to the beta images.
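For readers unfamiliar with this kind of model, the sketch below builds an HRF regressor and an approximate temporal derivative regressor for a set of event onsets. It assumes a commonly used double-gamma response shape (gamma shapes 6 and 16, undershoot ratio 1/6); these parameters, the function names, and the onsets are illustrative assumptions rather than the settings of the original analysis, and the dispersion derivative is omitted.

```python
import math


def gamma_pdf(t, shape):
    """Gamma density with unit scale; zero for t <= 0."""
    if t <= 0:
        return 0.0
    return t ** (shape - 1) * math.exp(-t) / math.gamma(shape)


def canonical_hrf(t):
    """Double-gamma HRF: a response peaking near 5 s minus a late undershoot."""
    return gamma_pdf(t, 6) - gamma_pdf(t, 16) / 6


def regressors(onsets, n_scans, tr=2.5, dt=0.1, length=32.0):
    """Return (hrf_regressor, temporal_derivative_regressor), one sample per TR."""
    n_fine = int(round(n_scans * tr / dt))
    kernel = [canonical_hrf(i * dt) for i in range(int(round(length / dt)))]
    # Temporal derivative approximated by a backward finite difference.
    deriv = [0.0] + [(kernel[i] - kernel[i - 1]) / dt
                     for i in range(1, len(kernel))]
    main = [0.0] * n_fine
    tder = [0.0] * n_fine
    for onset in onsets:
        start = int(round(onset / dt))
        for k in range(len(kernel)):
            if start + k >= n_fine:
                break
            main[start + k] += kernel[k]
            tder[start + k] += deriv[k]
    step = int(round(tr / dt))
    return main[::step], tder[::step]


# Two hypothetical movie onsets, 3.2 s apart (2.5 s movie + 0.7 s blank).
hrf_reg, td_reg = regressors(onsets=[0.0, 3.2], n_scans=12)
```

Including the derivative regressors lets the fit absorb small shifts in the latency and width of the response, so that the HRF term itself is a cleaner estimate of response magnitude.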
We predicted that brain regions that represent another person’s immediate goal should show a greater response to novel goals than to repeated goals but should not distinguish trajectories. Based on previous work, we considered three regions likely candidates for a goal representation. First, the inferior frontal gyrus (IFG) is activated by both action observation and imitation (Iacoboni et al., 1999; Rizzolatti and Craighero, 2004). Neurons in this area of the macaque brain respond when an action is inferred to take place behind a screen (Umilta et al., 2001), and some studies have linked this region to intentionality (Iacoboni et al., 2005). Second, the intraparietal sulcus (IPS) is activated in action planning, execution, and observation tasks (Grezes and Decety, 2001; Frey et al., 2005). In particular, recent work shows that transcranial magnetic stimulation over IPS disrupts the formation of a new grasp to a new goal (Tunik et al., 2005). These frontal and parietal regions together comprise the human mirror neuron system for action representation (Rizzolatti and Craighero, 2004). Third, the right superior temporal sulcus (STS) has been associated with the observation of actions (Jellema et al., 2000), in particular unexpected actions (Pelphrey et al., 2004; Saxe et al., 2004), and is a possible locus for a goal representation.
To test for goal within these regions, we created a region of interest (ROI) mask that included the right posterior superior temporal sulcus [coordinates from Pelphrey et al. (2004) and Saxe et al. (2004)], the left and right BA44 (Amunts et al., 1999), and the left intraparietal sulcus (defined anatomically from the high-resolution scans of the participants). Within the mask, a t test on the HRF contrast for the main effect of goal (novel > repeated) was performed, and activations at p < 0.001 and 10 voxels uncorrected are reported. We also took a global approach and used an F test over the HRF, temporal derivative, and dispersion derivative over the entire brain to locate any regions showing reliable effects of goal at p < 0.001 and 10 voxels uncorrected. Similarly, we tested for RS related to the observation of repeated hand trajectories, using a t test on the HRF contrast within the ROI and using an F test over the entire brain. In both cases, we report clusters that were significant at p < 0.001 and 10 voxels uncorrected.
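The combined voxel and extent criterion (p < 0.001 and at least 10 voxels) can be sketched as follows. This is a toy illustration on a dictionary of voxel p values, not the SPM2 procedure; the 6-connectivity rule and all names are assumptions, because the connectivity criterion is not stated.

```python
from collections import deque


def extent_threshold(p_map, p_thresh=0.001, min_voxels=10):
    """Return clusters of supra-threshold voxels with at least `min_voxels`.

    `p_map` maps (x, y, z) voxel indices to uncorrected p values.
    Face-adjacent voxels (6-connectivity) are grouped into one cluster;
    this connectivity rule is an assumption made for illustration.
    """
    active = {v for v, p in p_map.items() if p < p_thresh}
    neighbours = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    clusters, seen = [], set()
    for start in sorted(active):
        if start in seen:
            continue
        seen.add(start)
        queue, cluster = deque([start]), []
        while queue:  # breadth-first flood fill over connected voxels
            x, y, z = queue.popleft()
            cluster.append((x, y, z))
            for dx, dy, dz in neighbours:
                nb = (x + dx, y + dy, z + dz)
                if nb in active and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        if len(cluster) >= min_voxels:
            clusters.append(sorted(cluster))
    return clusters


# Toy map: a 12-voxel line that survives and a 3-voxel blob that does not.
toy = {(x, 0, 0): 0.0005 for x in range(12)}
toy.update({(x, 5, 5): 0.0002 for x in range(3)})
toy[(20, 20, 20)] = 0.2  # non-significant voxel, ignored
surviving = extent_threshold(toy)
```

The extent criterion discards isolated voxels that pass the height threshold by chance, at the cost of missing genuinely small activations.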
Results
We found evidence for repetition suppression for goal in just two cortical regions, both in the left IPS (Fig. 3, Table 1). Both clusters showed a typical HRF to the presentation of the video stimuli, and more importantly, HRF magnitude was reduced when a second video clip with the same immediate goal was presented, regardless of the trajectory taken by the hand. This pattern of response is characteristic of RS and provides clear evidence that this region is specifically sensitive to goals. The more anterior IPS cluster survived a correction for multiple comparisons at p < 0.05 (Fig. 3A).
If statistical thresholds within the ROI were lowered to p < 0.05 uncorrected, RS for goal was found in both the left and right IFG but not in the STS. However, at such a liberal threshold, we do not consider the IFG result to be robust.
There was no evidence for RS related to the observed trajectory within the ROI mask. Over the entire brain, RS for trajectory was found in left lateral occipital sulcus and right superior precentral sulcus (supplemental Fig. 1A, B, available at www.jneurosci.org as supplemental material). These regions both showed a typical HRF response to the video clips, which was suppressed when the same hand trajectory was repeated, regardless of goal. Other regions showing effects in the F test for the observed motion trajectory are listed in supplemental Table 1 (available at www.jneurosci.org as supplemental material). An analysis of interactions between trajectory and goal did not reveal any activations.
Discussion
Our results indicate that repeated observation of an action directed toward the same goal results in a systematic reduction of activation in the left intraparietal sulcus. In contrast, repeated observation of the same hand trajectory did not cause suppression within the regions of interest. These data have two important implications. First, we have demonstrated that it is possible to obtain RS effects for action representation tasks, thus leading to functional localizations. This opens up the possibility of using RS to study a much wider range of functions than examined previously. Second, we show that two loci in the left IPS are the only cortical regions to show robust repetition suppression for immediate hand action goals.
Implications of repetition suppression
Repetition suppression has not been used previously to study motor representations, and the interpretation of RS differs from the interpretation of traditional subtraction and interaction fMRI studies. We base our interpretation of RS in fMRI on previous models (Grill-Spector and Malach, 2001; Naccache and Dehaene, 2001), which rely on two basic assumptions. The first is that the fMRI signal reflects the activity of populations of neurons, and this activity encodes information in a distributed population code (Georgopoulos et al., 1992; Shadlen and Newsome, 1994). The second is that neuronal firing tends to be attenuated when a stimulus is presented repeatedly. This has been shown in monkey temporal and frontal cortex, where activity in single neurons is often reduced the second time a stimulus is presented (Miller et al., 1991; Lueschow et al., 1994).
Applying these two assumptions to the representation of goal, we might posit a population code in which one subpopulation responds preferentially to the observation of one type of goal, for example a “take-cookie” goal, whereas another subpopulation would respond to the observation of a different goal, for example a “take-disk” goal. In the current experiment, when an actress reaching for a cookie is observed, the take-cookie subpopulation will respond, and a BOLD signal will be recorded. When the second video clip again shows the actress reaching for the cookie (even in a different location), the take-cookie subpopulation has now habituated and will respond less vigorously, leading to repetition suppression in the BOLD signal. If the third video in the sequence shows the actress reaching for the disk, the habituated take-cookie neurons do not fire, but the fresh take-disk subpopulation will respond, and a robust BOLD signal will be recorded. This differential response can occur even if the same total number of neurons is activated for each possible goal, as long as different goals are represented by different subpopulations. Thus, the presence of RS is able to characterize the population coding within a brain region, and in the current experiment, it reveals coding for goal in the IPS.
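The logic of this population-code account can be made concrete with a toy simulation. The adaptation factor, the full recovery of a subpopulation after one trial without a response, and all names are assumptions made for illustration; the sketch is not fitted to any data.

```python
def simulate_bold(goals, adaptation=0.5, recovery=1.0):
    """Toy population-code model of repetition suppression.

    Each goal has its own subpopulation with a response gain. Responding
    multiplies that subpopulation's gain by `adaptation`; a subpopulation
    that does not respond on a trial recovers to `recovery`. All
    parameter values are illustrative assumptions.
    """
    gain = {}
    bold = []
    for goal in goals:
        g = gain.get(goal, recovery)
        bold.append(g)          # response of the active subpopulation
        for other in gain:      # inactive subpopulations recover fully
            if other != goal:
                gain[other] = recovery
        gain[goal] = g * adaptation
    return bold


# take-cookie, take-cookie again, then take-disk
responses = simulate_bold(["cookie", "cookie", "disk"])
```

With these settings the simulated signal is 1.0 for the first cookie movie, 0.5 for the repeated cookie goal, and 1.0 again for the novel disk goal, even though each goal recruits a subpopulation of the same size.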
Two potential difficulties with these results must be addressed. First, the different goal objects might have elicited different grasps from the actress, in which case the RS is linked to grip aperture rather than goal. However, we chose a variety of objects as goals, matching size and shape while distinguishing function. Examination of the videos shows that changes in hand configuration between gripping an object with trajectories from the left or right are at least as large as changes between objects. Thus, for our study, RS in the IPS is specific to the interaction of the hand with the object, that is, the goal.
Second, previous studies have linked RS to behavioral priming (Maccotta and Buckner, 2004; Wig et al., 2005). We have no measure of behavioral priming here, because reaction time tasks are not compatible with the presentation of an action in video. This is a limitation in that we cannot draw conclusions about the behavioral priming of goal. It is also an advantage because subjects were free to attend to the whole of each movie, rather than focus on one component of it. Movies were sequenced such that every movie played the role of both “prime” and “target,” and no subject was aware of the sequence manipulation. This ensures that shifts in attention or cognitive strategy between movies cannot be responsible for the observed RS. Instead, we suggest that RS in the IPS reflects changes in neuronal firing in a population that encodes the goals of other people’s actions.
The representation of goal
The representation of immediate goal was found in two regions of the lateral bank of IPS, within the inferior parietal lobe (IPL). Previous work shows that the IPL and IPS are activated by the observation of hand actions (Grezes and Decety, 2001) and that damage to parietal regions impairs the ability to interpret actions (Rothi et al., 1985). An IPS site (−24, −53, 58) adjacent to our sites also responds to the observation of intentional hand actions (Pelphrey et al., 2004), although unlike the STS site reported in the same paper, the parietal region was not modulated by the correctness of the observed action. The inferior parietal cortex is considered part of the human mirror system (Rizzolatti and Craighero, 2004), and recent data demonstrate object coding in the IPS, possibly related to a goal representation (Shmuelof and Zohary, 2005).
The more anterior IPS cluster overlaps a region associated with grasping in humans (Frey et al., 2005), and disruption of this region using transcranial magnetic stimulation delays the correction of grasp to conform to a new goal (Tunik et al., 2005). This work suggests that during actions, IPS maintains a representation of the current goal to correct for errors. Similarly, RS for observed goal may reflect the maintenance of a goal representation from one trial to the next. In monkey parietal cortex, neurons have been recorded that signal the monkey’s decision to move (Platt and Glimcher, 1999) and intention to move (Andersen and Buneo, 2002). More recently, single cells in macaque IPL were shown to respond selectively to both the performance and observation of an action within a sequence leading to a specific goal and not to the same action when it was part of a sequence achieving a different goal (Fogassi et al., 2005). These results all suggest that parietal cortex is a critical region for the representation of action plans and goals. We have now demonstrated that, in humans, the IPS is uniquely sensitive to the goals of other people’s hand actions.
There are not yet sufficient data to determine whether the parietal goal representation we report in adults is also present in 6-month-old infants and is responsible for the habituation results reported by Woodward (1998). However, functional imaging in 2- to 3-month-old infants listening to speech has demonstrated left lateralized activations similar to those found in adults (Dehaene-Lambertz et al., 2002). This suggests that at least some functional localizations are comparable between the infant brain and the adult. In the more specific case of habituation, the neural basis is not known. It is possible that infant habituation to repeated goals results from a decrease in parietal activity similar to the RS for goal that we have observed in adults, but it is also possible that habituation arises from a more general, unlocalized attentional mechanism. This question can only be resolved by studying neural responses in infants, and we suggest that our approach, of intermixing novel and repeated stimuli rather than performing a single habituation sequence, might provide a useful method.
In contrast to the robust RS for goal in the IPS, we found very weak goal-related RS in the IFG and none in the STS. This might indicate a limitation of the RS method; it is possible that some brain regions are not subject to suppression when stimuli are repeated. However, RS has been found in a wide range of brain areas including the IFG (Thompson-Schill et al., 1999), and only primary sensory cortices seem to be immune from RS (Buckner et al., 1998). Thus, it seems unlikely that we failed to detect a goal representation in the IFG or STS because of a lack of RS in these regions. RS is specific to the stimulus parameters encoded in an area, so it is more likely that the IFG and STS do not encode immediate goals as tested in the current experiment. These regions might encode goal at a different level of representation. For example, the IFG seems to respond to goal-directed actions in context (Iacoboni et al., 2005) and is particularly concerned with imitating actions (Iacoboni et al., 1999; Buccino et al., 2004), rather than pure observation of actions. The STS responds to unexpected intentional acts signaled by whole-body motion (Saxe et al., 2004), eye movement (Pelphrey et al., 2003), and hand movement (Pelphrey et al., 2004).
It may be an accident of history that mirror neurons were first discovered in inferior frontal regions (Gallese et al., 1996) and that some of the early human fMRI studies emphasized this region (Iacoboni et al., 1999). This has led some authors to consider only the IFG when studying the mirror system (Koski et al., 2002; Johnson-Frey et al., 2003), and in computational models, parietal areas have been described as merely a relay between the visual cortex and IFG (Keysers and Perrett, 2004). However, the current evidence defines an important and unique function for the parietal cortex in action understanding. Our results complement recent monkey studies of decoding intentionality (Fogassi et al., 2005) and suggest that the IPS is not just a relay but has a central role in representing and interpreting the goals of observed hand actions.
This work was funded by the McDonnell Foundation. We thank Tammy Laroche for help with scanning and Uta Frith and Bill Kelly for comments on this manuscript.
Correspondence should be addressed to Dr. Antonia F. de C. Hamilton at the above address.