Several regions of the posterior-lateral-temporal cortex (PLTC) are reliably recruited when participants read or listen to action verbs, relative to other word and nonword types. This PLTC activation is generally interpreted as reflecting the retrieval of visual-motion features of actions. This interpretation supports the broader theory that concepts are comprised of sensory–motor features. We investigated an alternative interpretation of the same activations: PLTC activity for action verbs reflects the retrieval of modality-independent representations of event concepts, or the grammatical types associated with them, i.e., verbs. During a functional magnetic resonance imaging scan, participants made semantic-relatedness judgments on word pairs varying in amount of visual-motion information. Replicating previous results, several PLTC regions showed higher responses to words that describe actions versus objects. However, we found that these PLTC regions did not overlap with visual-motion regions. Moreover, their response was higher for verbs than nouns, regardless of visual-motion features. For example, the response of the PLTC is equally high to action verbs (e.g., to run) and mental verbs (e.g., to think), and equally low to animal nouns (e.g., the cat) and inanimate natural kind nouns (e.g., the rock). Thus, PLTC activity for action verbs might reflect the retrieval of event concepts, or the grammatical information associated with verbs. We conclude that concepts are abstracted away from sensory–motor experience and organized according to conceptual properties.
Concepts are mental representations that form the meanings of words and allow us to categorize events and entities in the world (Medin and Smith, 1984). During development, and throughout the course of our lives, perceptual and action experiences allow us to form concepts. What role does perceptual experience play during subsequent concept retrieval (e.g., during word comprehension)?
According to one hypothesis, concept retrieval is the reactivation of experiences stored in the sensory–motor cortices (Allport, 1985; Martin et al., 1996; Pulvermüller, 1999, 2001, 2002; Prinz, 2002; Gallese and Lakoff, 2005). For example, retrieving the concept “kick” reactivates the representation of our experiences of kicking, including seeing someone else kick or planning and executing kicking ourselves. Alternatively, concepts might be abstracted away from sensory–motor experiences. According to this proposal, concepts are represented outside of sensory–motor cortices and organized by conceptual, rather than perceptual, properties (Caramazza et al., 1990; Rogers et al., 2004; Mahon and Caramazza, 2008).
Support for the idea that sensory–motor experiences are replayed during conceptual retrieval comes from neuroimaging studies of action verbs. Comprehension of action verbs leads to increased activity in the posterior-lateral-temporal cortices (PLTC), when compared with comprehension of names of objects or nonwords (Martin et al., 1995; Damasio et al., 2001; Kable et al., 2002, 2005; Davis et al., 2004; Bedny and Thompson-Schill, 2006). This PLTC activation is said to occur in, or near, brain regions that process visual motion, specifically middle temporal area (MT+), which subserves basic motion processing, and the right superior temporal sulcus (rSTS), which is important for biological motion perception (Watson et al., 1993; Grossman et al., 2000, 2005; Giese and Poggio, 2003). Understanding action verbs is therefore thought to activate visual-motion representations (Martin et al., 1995; Damasio et al., 2001).
Alternatively, PLTC activity for action verbs may lie outside of motion perception areas and reflect the retrieval of nonsensory, conceptual or grammatical information relevant to action verbs. Consistent with this interpretation, regions within the PLTC respond not only to action verbs, but also to abstract verbs (Grossman et al., 2002; Davis et al., 2004; Bedny and Thompson-Schill, 2006; Shapiro et al., 2006). However, to date the grammatical class and visual-motion features of words have not been manipulated within the same study. Nor has it been established whether action-verb activation occurs within or outside of motion-perception regions. It is therefore not known whether PLTC activity for action verbs is a function of visual-motion features, grammatical category, or both.
In the present study, we tested the prediction that understanding action verbs activates motion-perception regions. We then determined whether the response of PLTC regions during word comprehension is predicted by the presence or absence of visual-motion features, or by a word's grammatical class. PLTC activity was measured during motion perception and while subjects made semantic judgments about verbs and nouns that varied as to whether their meanings contained visual-motion features.
Materials and Methods
Twelve healthy, right-handed native English speakers (six females) took part in the word comprehension experiment. The average age of the participants was 24 years (SD, 3). Eleven of these participants also took part in the localizer experiments. None had psychiatric or neurological disorders, had ever sustained a head injury, or were taking psychoactive medications. All subjects gave informed consent and were paid $30 per hour for taking part in the experiment.
Participants took part in three experiments during a single scan session: a word comprehension experiment and two “localizer” experiments (biological motion and basic motion).
In the word comprehension experiment (five runs of 7.7 min), participants heard pairs of words presented over headphones. Participants indicated how related in meaning the two words were on a scale of one to four by pressing buttons one through four on a response pad. Word pairs were presented in blocks of five and were blocked by condition. Blocks were 18 s long and were separated by 14 s of fixation.
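The block timing can be checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis. It assumes seven blocked conditions (the six word categories plus the nonword condition, each with 50 pairs), an even split of blocks across the five runs, and a fixation period before the first block and after the last one. None of these three details is stated explicitly, and the function names are illustrative.

```python
# Values stated in the text.
BLOCK_S = 18.0             # duration of one block of word pairs
FIXATION_S = 14.0          # fixation between blocks
PAIRS_PER_BLOCK = 5
PAIRS_PER_CONDITION = 50
N_RUNS = 5

# ASSUMPTION (not stated in the text): seven blocked conditions,
# i.e. six word categories plus nonwords, all with 50 pairs.
N_CONDITIONS = 7


def seconds_per_pair():
    """Time available for each word pair within a block."""
    return BLOCK_S / PAIRS_PER_BLOCK


def blocks_per_run():
    """Blocks per run, ASSUMING blocks are split evenly across runs."""
    blocks_per_condition = PAIRS_PER_CONDITION // PAIRS_PER_BLOCK
    return N_CONDITIONS * blocks_per_condition // N_RUNS


def run_length_s():
    """Run length, ASSUMING fixation before the first and after the last block."""
    n_blocks = blocks_per_run()
    return n_blocks * BLOCK_S + (n_blocks + 1) * FIXATION_S


print(seconds_per_pair())       # 3.6
print(blocks_per_run())         # 14
print(run_length_s() / 60.0)    # 7.7
```

Under these assumptions the arithmetic reproduces the stated run length of 7.7 min, which suggests, but does not establish, that the design followed this layout.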
Word stimuli consisted of 50 words in each of the following categories: high-motion verbs (action); intermediate-motion verbs (change-of-state and bodily function); low-motion verbs (mental); high-motion nouns (animals); intermediate-motion nouns (tools); and low-motion nouns (inanimate natural kinds). We obtained motion ratings for all words in a separate set of 14 participants who rated words on the extent to which their meanings brought to mind visual movement (for instructions, see supplemental text; for results of the motion rating experiment, see supplemental Fig. 1, available at www.jneurosci.org as supplemental material). Nouns and verbs, as well as the different motion categories, were matched on familiarity (based on a prior rating study with a separate set of subjects), and were also matched by length in syllables and phonemes, as well as frequency (based on the MRC database) (familiarity: Mnoun = 4.72, SD = 0.75, Mverb = 4.58, SD = 1.31; syllable number: Mnoun = 1.31, SD = 0.77, Mverb = 1.27, SD = 0.48; all p values > 0.10) (Coltheart, 1981) (for further details, see supplemental Table 3, available at www.jneurosci.org as supplemental material). A nonword condition was also included, but not analyzed for the present study. Stimuli were digitally recorded by a male native English speaker at a sampling rate of 44,100 Hz to produce 32-bit digital sound files. Audio files were normalized to each other in volume with respect to root mean square (RMS) amplitude such that all files, and consequently, all categories, had approximately equal RMS (average RMS, −12.04 dBFS) (http://normalize.nongnu.org/README.html). Words were presented in pairs, with 50 pairs per category. Each word was repeated once during the experiment, but paired with a different word the second time.
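RMS normalization of this kind can be illustrated with a short sketch. This is not the procedure of the normalize tool cited above. It assumes samples scaled to the range −1 to 1, a dBFS convention in which an RMS of 1.0 corresponds to 0 dBFS, and a synthetic tone standing in for a recorded word. The function names are hypothetical.

```python
import math


def rms_dbfs(samples):
    """RMS level in dBFS, ASSUMING samples in [-1, 1] and RMS 1.0 == 0 dBFS."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        raise ValueError("silent signal has no defined RMS level")
    return 10.0 * math.log10(mean_square)


def normalize_rms(samples, target_dbfs=-12.04):
    """Scale samples by a single gain so their RMS level equals target_dbfs."""
    gain_db = target_dbfs - rms_dbfs(samples)
    gain = 10.0 ** (gain_db / 20.0)
    return [s * gain for s in samples]


# Synthetic 440 Hz tone, one second long (illustrative stand-in for a word).
SAMPLE_RATE = 44100
tone = [0.1 * math.sin(2.0 * math.pi * 440.0 * n / SAMPLE_RATE)
        for n in range(SAMPLE_RATE)]

normalized = normalize_rms(tone)
print(round(rms_dbfs(normalized), 2))   # -12.04
```

Applying one gain per file equates RMS across files without altering the relative amplitude envelope within each word.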
In the biological motion localizer experiment, participants performed a one-back task with point-light animations of human hand, leg and whole-body actions, such as kicking, running and jumping (animation duration, 1 s). The depicted actions were similar to the meanings of action verbs in the word comprehension experiment. The control condition consisted of scrambled point-light animations, which are perceived as meaningless dot movements rather than human actions (for further details, see Grossman et al., 2000). The biological motion and control conditions were blocked (block length, 18 s; between block interval, 12 s). The experiment consisted of two runs, each 5.17 min long.
During the basic motion localizer experiment, participants saw four types of blocks. Motion blocks consisted of contracting and expanding concentric rings. In the motion control condition, the same concentric rings changed in luminance, but did not move (Tootell et al., 1995; Ahlfors et al., 1999). Two other conditions were included: bodies (still photographs of bodies and body parts) and objects (photographs of manmade objects and object parts). These conditions allowed us to localize the extrastriate body area (EBA) and the lateral-occipital complex (LOC). Throughout the experiment, participants were instructed to fixate on the center of the screen (for details of procedure, see Saxe et al., 2006).
Functional magnetic resonance imaging data acquisition and analysis.
Structural and functional data were collected on a 3 Tesla Siemens scanner at the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at the Massachusetts Institute of Technology. T1-weighted structural images were collected in 128 axial slices with 1.33 mm isotropic voxels [repetition time (TR) = 2 ms; echo time (TE) = 3.39 ms]. Functional, blood oxygenation level-dependent (BOLD) data were acquired in 3 × 3 × 4 mm voxels (TR = 2 s; TE = 30 ms) in 30 near-axial slices. The first four seconds of each run were excluded to allow for steady-state magnetization.
Data analysis was performed using SPM2 (http://www.fil.ion.ucl.ac.uk/) and in-house software. The data were realigned, smoothed with a 5 mm smoothing kernel, and normalized to a standard template in Montreal Neurological Institute space. The modified-linear model was used to analyze BOLD activity of each subject as a function of condition. Covariates of interest were convolved with a standard hemodynamic response function (HRF). Nuisance covariates included run effects, an intercept term, and global signal. Time-series data were subjected to a high-pass filter (cutoff, 1/128 Hz).
BOLD signal differences between conditions were evaluated through second level, random-effects analysis. In whole-brain analyses, the false-positive rate was controlled at α < 0.05 (corrected) by performing Monte Carlo permutation tests on the data (using a cluster size threshold with a primary threshold of 3) (Nichols and Holmes, 2002; Hayasaka and Nichols, 2004). Whole-brain analyses included 12 subjects for the word comprehension experiment, and 11 subjects for each of the two localizer experiments. The whole-brain, overlap analysis thus included 11 subjects. Region-of-interest (ROI) analyses were performed on the average of percentage signal change (PSC) from TR 3 through 10 relative to a rest baseline (the first two TRs were excluded to account for the hemodynamic lag) (for examples of similar analyses, see Saxe et al., 2006; Baker et al., 2007). Functional ROIs were identified in individual subjects based on localizer experiments or orthogonal contrasts. The one participant who did not take part in the localizer experiments did not have perceptual ROIs and was thus excluded from all analyses of those ROIs. For the purposes of defining ROIs, contrasts were thresholded in individual subjects at p < 0.001, k ≥ 10. If no voxels were observed at this threshold, the threshold was lowered to p < 0.01. If no voxels were observed at the lowered threshold, the subject was excluded from that analysis. The number of subjects included in each analysis is indicated throughout the results section. Analyses comparing high-motion to low-motion words, as well as intermediate-motion to low-motion words included 200 items. Comparisons of high- to low-motion categories for nouns and verbs separately (as well as all other pairwise comparisons) included 100 word pairs (50 in each category).
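The ROI measure and the threshold fallback can be sketched as follows. This is a simplified illustration, not the in-house software used in the study. It assumes a time course indexed from block onset in TRs, a scalar rest baseline, and a dictionary of voxelwise p values. It also omits the cluster-extent criterion (k ≥ 10), which would require voxel connectivity information. All names are hypothetical.

```python
def mean_psc(block_signal, rest_baseline, first_tr=3, last_tr=10):
    """Mean percentage signal change over TRs first_tr..last_tr (1-indexed, inclusive)."""
    window = block_signal[first_tr - 1:last_tr]
    psc = [100.0 * (value - rest_baseline) / rest_baseline for value in window]
    return sum(psc) / len(psc)


def select_roi_voxels(voxel_p_values, thresholds=(0.001, 0.01)):
    """Return (voxels, threshold) for the strictest threshold that yields voxels.

    Returns (None, None) when no voxel survives the most lenient threshold,
    in which case the subject would be excluded from that ROI analysis.
    SIMPLIFICATION: the cluster-extent criterion is not applied here.
    """
    for threshold in thresholds:
        voxels = [v for v, p in voxel_p_values.items() if p < threshold]
        if voxels:
            return voxels, threshold
    return None, None


# Hypothetical time course (arbitrary units), one value per TR from block onset.
signal = [100.0, 100.0, 101.0, 102.0, 102.0, 102.0,
          102.0, 102.0, 102.0, 101.0, 100.0, 100.0]
print(mean_psc(signal, rest_baseline=100.0))              # 1.75

# Hypothetical voxelwise p values for one subject.
print(select_roi_voxels({"v1": 0.004, "v2": 0.2}))        # (['v1'], 0.01)
print(select_roi_voxels({"v1": 0.5}))                     # (None, None)
```

Skipping the first two TRs keeps the early part of the block, before the hemodynamic response has risen, out of the average.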
For verbs, actions had the highest motion ratings, followed by change of state/bodily function verbs and then by mental verbs. (Change of state and bodily function verbs were grouped into the intermediate motion category for verbs because they had similar motion ratings.) For the nouns, animals had the highest motion ratings, followed by tools, and then by natural kinds. Within grammatical classes, the main effect of motion category was reliable (Fitem > 25, p < 0.0001; Fsubject > 4, p < 0.05) (supplemental Fig. 1A, available at www.jneurosci.org as supplemental material). Based on these ratings, we defined two contrasts of high-motion and low-motion words for the functional magnetic resonance imaging analyses, collapsing across grammatical class. For the primary contrast, high-motion words included action verbs and animal nouns, whereas the low-motion words included mental verbs and inanimate natural kind nouns. The high-motion words were reliably higher in motion ratings than low-motion words (titem(198) = 13.84, p < 0.0001; tsubject(13) = 7.73, p < 0.0001) (Fig. 1A). We also defined a secondary motion contrast, comparing the intermediate and low-motion categories, which differed reliably in motion ratings (bodily function and change of state verbs + tool nouns > mental verbs + inanimate natural-kind nouns; titem(198) = 6.48, p < 0.0001; tsubject(13) = 3.59, p < 0.01). This secondary set was orthogonal to the action verbs − animal nouns contrast and was used to test hypotheses in brain regions defined by the action verbs − animal nouns contrast.
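The degrees of freedom reported above are consistent with a by-item comparison of two independent groups of 100 words (df = 198) and a by-subject paired comparison across the 14 raters (df = 13). The sketch below illustrates both statistics. The choice of a pooled-variance test for items and a paired test for subjects is inferred from the degrees of freedom rather than stated in the text, and the ratings are made-up placeholders, not data from the study.

```python
import math


def item_t(high, low):
    """Pooled-variance two-sample t statistic; df = n1 + n2 - 2."""
    n1, n2 = len(high), len(low)
    m1, m2 = sum(high) / n1, sum(low) / n2
    ss1 = sum((x - m1) ** 2 for x in high)
    ss2 = sum((x - m2) ** 2 for x in low)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    t = (m1 - m2) / math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return t, df


def subject_t(high, low):
    """Paired t statistic across raters; df = n - 1."""
    diffs = [h - l for h, l in zip(high, low)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n), n - 1


# Placeholder ratings (hypothetical): 100 items per motion level ...
high_items = [5.0 + 0.1 * (i % 5) for i in range(100)]
low_items = [2.0 + 0.1 * (i % 5) for i in range(100)]
# ... and one mean rating per rater (14 raters) for each motion level.
high_by_rater = [5.0 + 0.1 * (i % 3) for i in range(14)]
low_by_rater = [2.0 + 0.2 * (i % 4) for i in range(14)]

print(item_t(high_items, low_items)[1])         # 198
print(subject_t(high_by_rater, low_by_rater)[1])  # 13
```

Reporting both statistics indicates whether an effect generalizes across words as well as across raters.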
Based on the biological motion localizer experiment, we defined right and left STS (lSTS) ROIs in individual subjects, with average peak voxels [58 −49 13] and [−57 −55 12]. Based on the basic motion localizer experiment, we defined the following ROIs in individual subjects: right and left MT+ ([48 −66 2], [−46 −72 3]), right and left EBA ([54 −62 5], [−53 −70 6]), and right and left LOC ([43 −78 −6], [−45 −74 −6]) (supplemental Table 1, available at www.jneurosci.org as supplemental material). Visual-motion regions identified in the whole-brain random effects analyses are shown in Figure 1 (supplemental Table 2, available at www.jneurosci.org as supplemental material).
Word comprehension experiment
Participants judged the semantic similarity of pairs of words. The average similarity was not different for nouns versus verbs (Mnoun = 1.86 ± 0.38, Mverb = 1.85 ± 0.34; t(8) = 0.28, p = 0.78). Furthermore, there was no difference in average similarity among categories of nouns (F(2,16) = 0.60, p = 0.56) or verbs (F(2,16) = 0.54, p = 0.60; all paired comparisons t < 1, p > 0.3). High-motion words (action verbs + animal nouns) did not differ in average pairwise similarity from low-motion words (mental verbs + inanimate natural kinds) (t < 1, p > 0.3).
Participants responded faster to noun pairs than verb pairs (Mnoun = 1702 ± 118 ms, Mverb = 1823 ± 87 ms; t(8) = 5.06, p = 0.001). Participants responded to all semantic categories of verbs equally fast, and the same was true of nouns (Fnoun(2,16) = 2.07, p > 0.15; Fverb(2,16) = 0.64, p > 0.5; all paired comparisons, t < 1.9, p ≥ 0.10). High-motion words did not differ in reaction time from low-motion words (t < 1, p > 0.3) (for relatedness ratings and reaction time of each category, see supplemental Table 3, available at www.jneurosci.org as supplemental material).
Response of motion perception regions during word comprehension
Do motion perception regions distinguish between motion words and nonmotion words?
Do the brain regions involved in visually perceiving motion also support understanding words that have visual-motion features? To answer this question, we examined activity during word comprehension in bilateral MT+ and STS. From each of the ROIs, we extracted the PSC relative to resting baseline for each word category. We then tested whether the response differentiated between the highest and lowest motion-word categories. We also assessed whether any of these regions differentiated between verbs and nouns.
PSC did not differ among word types within the right or left MT+ ROIs: high-motion words did not differ from low-motion words, nor did verbs differ from nouns. No verb or noun types differed from each other (F < 2, p > 0.2). In fact, the BOLD response in bilateral MT+ was reliably lower, during all word conditions, than during the rest condition (Mright MT = −0.35 ± 0.24, t(10) = −4.92, p < 0.0001; Mleft MT = −0.36 ± 0.24, t(10) = −4.96, p < 0.0001). This result is consistent with previous findings showing that perceiving stimuli in one modality (auditory, in this case) results in suppression of regions that are important for perception in a different modality (visual, in this case) (Haxby et al., 1994; Sadato et al., 1996) (Fig. 1B).
In the right STS, high-motion words (action verbs and animal nouns) did not differ from low-motion words (mental verbs and inanimate natural kind nouns), nor did verbs differ from nouns (t < 1, p > 0.3). The high-motion nouns did not differ from low-motion nouns, nor did the high-motion verbs differ from low-motion verbs. None of the verb and noun categories differed among themselves (F < 1, p > 0.3). In the left STS, high-motion words did not differ from low-motion words (t < −1, p > 0.25). However, verbs produced greater activity than nouns (Mverb = 0.18 ± 0.44, Mnoun = 0.03 ± 0.24, t(10) = 3.53, p < 0.001). This effect remained reliable when reaction time was included as a covariate in each subject's model (t(7) = 2.79, p < 0.05). None of the verb categories differed among themselves (F < 1, p > 0.3). Among the nouns, natural objects produced greater activity than animals (F(2,11) = 6.4, p < 0.01, Tukey's honest significant difference, Q = 2.53, p < 0.05) (Fig. 1B).
Do other perceptual regions, EBA, and LOC, differentiate among words based on their sensory features?
Although the primary goal of this study was to examine the relationship between motion perception and word comprehension, we also examined word-related activity in other higher visual regions: the EBA and the LOC. During perception, the LOC responds preferentially to pictures of objects, so we assessed whether LOC responded more to names of objects (i.e., nouns) than names of actions and events (i.e., verbs). The EBA responds to pictures of animal bodies more than to other objects (Downing et al., 2001). Therefore, we examined whether the EBA responded more to names of animals than names of other categories of objects. None of these effects approached significance (p > 0.3). Bilateral EBA and LOC were reliably deactivated compared with rest during word comprehension (t < −3.1, p < 0.05), and did not differentiate word categories. (All ROI analyses included 11 subjects, with the exception of the lSTS biological motion region, which was identified in 10 of 11 subjects.)
Does the PLTC distinguish between high and low-motion words or verbs and nouns?
We conducted a whole-brain analysis to examine whether any region responded more to high-motion than to low-motion words [(action verbs + animal nouns) > (mental verbs + inanimate natural-kind nouns)]. We did not find any region that was more active for high-motion than low-motion words at the corrected threshold (through permutation analysis). Nor did any region respond more to high- than low-motion verbs or nouns when these were compared separately. To rule out the possibility of a subthreshold effect, we used an extremely lenient threshold (p < 0.01 uncorrected, k = 10) and looked for the conjunction of action verbs > mental verbs, and animal nouns > inanimate natural-kind nouns (the maximum contrasts of motion associations within each grammatical class) (Price and Friston, 1997; Friston et al., 2005). This conjunction analysis yielded no voxels in the PLTC.
Critically, the results of a whole-brain, random effects analysis replicated the previous finding of greater activity for action verbs than names of animals in the PLTC (supplemental Fig. 3, Table 2, available at www.jneurosci.org as supplemental material). Similarly, the verbs > nouns contrast replicated previous studies, revealing regions in PLTC, in addition to prefrontal and parietal regions (Fig. 2; supplemental Table 2, available at www.jneurosci.org as supplemental material). Both of the PLTC effects remained reliable when RT was included as a covariate in the model.
ROI analyses: do regions that respond to action concepts in the PLTC respond preferentially to words with high-motion associations or to verbs more than nouns?
To examine whether the action verb > animal noun effect was attributable to the greater motion content of action verbs, action verb > animal noun regions were defined in individual subjects within bilateral PLTC. In the left PLTC, we identified an ROI close to the left temporoparietal junction (lTPJ) (10 of 12 subjects; average peak, −58 −48 22) and an ROI on the STS (12 of 12 subjects; −57 −41 −1). In the right hemisphere we identified one ROI on the STS (10 of 12 subjects; 57 −46 11).
We investigated whether these regions respond more to intermediate- than low-motion words [(tool nouns + bodily function and change-of-state verbs) > (inanimate natural-kind nouns + mental verbs)]. (As noted in the methods section, the intermediate-motion words were reliably higher in motion ratings than the low-motion words.) Alternatively, these regions might differentiate between action verbs and animal nouns based on grammatical class (bodily function and change-of-state verbs + mental verbs > inanimate natural-kind nouns + tool nouns). The grammatical class groupings did not differ in average motion ratings in the behavioral experiment (t(13) < 1, p > 0.3).
None of the left or right PLTC action-verb regions showed a motion effect either for verbs and nouns together, or separately (Fig. 1B) (t < 1, p > 0.5) (for an HRF graph, see supplemental Fig. 2, available at www.jneurosci.org as supplemental material). In contrast, both of the left PLTC regions showed greater activity for verbs than nouns (t > 5, p < 0.0001). The difference between verbs and nouns in the left PLTC regions remained reliable after RT was included as a covariate in the model (left STS, t(7) = 3.60, p < 0.01; left TPJ, t(7) = 7.28, p < 0.0001). The right STS region showed a trend for greater activity for verbs (t(9) = 1.78, p < 0.1).
ROI analyses: do any regions that respond more to verbs than nouns distinguish between high-motion and low-motion words?
We used the contrast verbs > nouns to identify regions of interest in individual subjects, and tested whether any of these regions differentiated between word categories based on motion associations. The verb > noun contrast revealed three functional ROIs in the left PLTC of each subject: a region in the left TPJ, one in the posterior superior temporal gyrus (STG), and a third in the anterior STG (12 of 12 subjects for all ROIs). Two regions were identified in the right PLTC: a region in the posterior STG, and one in the posterior middle temporal gyrus (7 of 12 subjects for both ROIs). In addition to the PLTC, we also examined ROIs in the left (11 of 12 subjects) and right (7 of 12 subjects) inferior frontal gyri, the left (9 of 12 subjects) and right (8 of 12 subjects) inferior parietal lobule, and the left precentral gyrus (11 of 12 subjects). In these regions, we compared the highest motion categories to the lowest motion categories [(action verbs + animal nouns) > (mental verbs + inanimate natural-kind nouns)]. None of the verb-selective regions examined showed a greater response to high- than low-motion words, nor for high- than low-motion nouns or verbs separately (t values < 1, p > 0.3).
We also examined whether any region in the PLTC defined by verb > nouns also showed more activity for tools than animals (as reported by Kable et al., 2005). We replicated the finding of more activity for tools than animals in the posterior aspect of the left superior temporal gyrus. However, this effect cannot be attributed to motion information associated with tool words (even tool-specific motion information) because this region did not respond more to tools than inanimate natural kinds (t(11) = 1.2, p = 0.29, 12 of 12 subjects).
Spatial relationship of verb and motion perception regions
We first used whole-brain analysis to examine whether any of the regions that differentiate between verbs and nouns overlap, even partly, with brain regions involved in the perception of motion or biological motion (Fig. 2). There was no overlap between regions involved in basic motion perception and regions that differentiated between verbs and nouns. However, in the group average, one region of overlap was observed for verb comprehension and biological motion perception in the right PLTC (verb > nouns and biological motion > scrambled motion). Because group averaging leads to spatial blurring of nearby activations, we investigated the overlap between verb comprehension and biological motion perception in individual subjects. We calculated the percentage of all voxels active for both verb comprehension and biological motion perception (relative to the total number active in both tasks) in each individual (for details, see supplemental methods, available at www.jneurosci.org as supplemental material). On average, in the right posterior temporal lobe 3.1 ± 3.4% (range 0 to 11%) of voxels overlapped between verb comprehension and biological motion perception. In the left PLTC we observed 4.4 ± 6.8% overlap (range 0 to 22%) (Kung et al., 2007).
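The per-subject overlap measure can be illustrated with a short sketch. The exact formula is given in the supplemental methods, which are not reproduced here, so the denominator used below (the union of voxels active in either task) is an assumption, as are the function name and the voxel coordinates.

```python
def percent_overlap(verb_voxels, motion_voxels):
    """Percentage of voxels active in both tasks.

    ASSUMPTION: the denominator is the number of distinct voxels active
    in either task (the union); the study's exact definition may differ.
    """
    verb = set(verb_voxels)
    motion = set(motion_voxels)
    union = verb | motion
    if not union:
        return 0.0
    return 100.0 * len(verb & motion) / len(union)


# Hypothetical suprathreshold voxels for one subject, as (x, y, z) indices.
verb_active = [(10, 20, 5), (11, 20, 5), (12, 20, 5),
               (13, 20, 5), (14, 20, 5), (15, 20, 5)]
motion_active = [(15, 20, 5), (16, 20, 5), (17, 20, 5),
                 (18, 20, 5), (19, 20, 5)]

print(percent_overlap(verb_active, motion_active))   # 10.0
```

Computing the measure within each subject avoids the spatial blurring introduced by group averaging, which can make adjacent but distinct activations appear to overlap.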
Finally, we asked whether the neural response differentiated high-motion and low-motion words specifically in those voxels that overlapped between verb comprehension and biological motion perception in each subject. In the right STS overlap region, high-motion and low-motion words did not differ (neither when verbs and nouns were compared separately, nor when they were combined) (t(8) < 1, p > 0.3). In the left STS overlap region, there was greater activity for low-motion than high-motion words (t(8) = −4.54, p < 0.01). (Nine of the 12 subjects had some overlap between the verb comprehension and biological motion perception, and were thus included in this analysis.)
Separable perceptual and conceptual effects in the PLTC
Numerous studies have reported a greater neural response in the PLTC to action verbs than to animal nouns and nonword controls (Martin et al., 1995; Kable et al., 2002, 2005; Tranel et al., 2003; Tyler et al., 2003; Kemmerer et al., 2007). This response has been interpreted as activation of visual-motion regions during comprehension of action verbs, attributable to the high-motion associations of action concepts. As such, these data have been taken to support the more general claim that concepts retrieved during word comprehension are represented in sensory–motor cortices and are comprised of sensory–motor features. Contrary to these claims, we found that the PLTC response was not related to the visual-motion features of words. The PLTC regions that respond to action verbs respond more to all verbs than all nouns, regardless of whether the words have high-motion associations. For example, PLTC regions respond more to mental verbs than animal nouns, despite the fact that animal nouns are rated as having more visual-motion information. In fact, no PLTC region showed greater activity for high-motion words than low-motion words in any of the whole-brain or ROI analyses. Furthermore, none of the higher order visual regions we examined (bilateral MT+, STS, LOC, EBA) distinguished between conceptual categories based on their sensory associations. These data illustrate that the presence or absence of motion information is not a dimension along which concepts are organized in the PLTC, and, more generally, that sensory visual-motion features are not automatically retrieved during word comprehension.
One concern might be that a visual-motion effect is present in the PLTC, but we did not detect it because of a lack of power in our study. We think this is unlikely. First, we did, in fact, have adequate power (0.99 and 0.92) to detect a motion effect similar in size to the grammatical class effect, in action − animal PLTC ROIs. At 0.80 power, we would be able to detect an effect that was between 55% and 83% of the grammatical class effect size. A power analysis based on the action > animal effect size reported by Kable et al. (2002) showed that we had >0.99 power in all of our ROI analyses to detect a motion effect. Perhaps even more critically, rather than failing to find an effect, we replicate previous findings of greater activity for action verbs than animals both in whole-brain and ROI analyses. We find, however, that this effect is not attributable to the motion characteristics of action verbs, but, rather, to their grammatical category.
Recently, it was reported that a temporal region shows a higher response to visually presented tool motion than to other kinds of motion (Beauchamp et al., 2002). Could this region contain the visual-motion features associated with action concepts? In our study, no area in the lateral temporal lobe responded more to names of tools than names of animals and inanimate natural kinds, nor did any region respond more to motion verbs than mental state verbs. However, a direct test of this question requires functionally identifying the tool motion area and examining whether it responds to motion words. Such a study has yet to be performed.
Is it possible that people do normally access motion representations when they access action concepts, but they were prevented from doing so by the experimental task? In the current experiment, participants made semantic-relatedness judgments, comparing pairs of words within category. Such a detailed semantic judgment required subjects to access concepts. Furthermore, a wealth of behavioral evidence demonstrates that concepts are retrieved automatically when we hear or read words, regardless of the task (Stroop, 1935; Neely, 1991). The present results therefore demonstrate that retrieving action concepts during word comprehension does not lead to activation of visual-motion features. More generally, we suggest that retrieval of sensory–motor features is not obligatory during word comprehension.
How do our findings relate to recent reports of behavioral congruence effects between perception and language? For example, in a recent study, Meteyard et al. (Kaschak et al., 2005; Meteyard et al., 2007) demonstrated that subjects were faster to indicate whether a dot display contained coherent motion, when that motion was in the same direction as a verb they heard. One interpretation of these behavioral effects, in light of our findings, is that they arise from neural loci that are distinct from perceptual cortices. For example, information about the trajectory of a verb might be represented in rich, but modality-independent, semantic representations within the PLTC. Alternatively, linguistic stimuli might affect processing of sensory–motor representations when the perceptual representations are concurrently activated by a perceptual task. Thus, these previous behavioral findings are not inconsistent with our finding that word comprehension does not entail the reactivation of sensory experiences.
It seems probable, however, that sensory–motor representations can be accessed in response to word stimuli during some cognitive tasks. For example, sensory–motor features might be activated during imagery, and for making judgments about detailed sensory properties of named objects (Tyler and Moss, 2001; Caramazza and Mahon, 2003; Oliver and Thompson-Schill, 2003; Tyler et al., 2003; Rogers et al., 2004; Machery, 2006, 2007). Previous studies have demonstrated right STS activity when subjects were specifically instructed to imagine motion in response to verbal cues (Grossman and Blake, 2001). Additionally, regions near MT+ may be activated when subjects hear words from an artificial lexicon whose semantic representations consist entirely of associatively learned perceptual features (Revill et al., 2008). Also, as described above, some studies have reported motor activity during passive listening to action verbs (Hauk et al., 2004). However, simply making visual-motion features of real words pertinent to a semantic decision in a purely linguistic task is not sufficient to activate area MT+ (Kable et al., 2005). An important question for future research concerns the circumstances during natural language comprehension that lead to the retrieval of sensory–motor information and the role this information plays in cognition and behavior (Mahon and Caramazza, 2008).
Verb selectivity in regions anterior to biological motion perception regions
Previous studies have reported regions in PLTC that show greater activity for verbs than nouns (Warburton et al., 1996; Perani et al., 1999; Kable et al., 2002; Davis et al., 2004; Kable et al., 2005; Bedny and Thompson-Schill, 2006; Shapiro et al., 2006; Yokoyama et al., 2006; Bedny et al., 2007). The present study demonstrates that these regions do not represent visual-motion properties, but their precise role remains unclear. Nonetheless, unlike some other category-specific findings, verb effects in the PLTC are robust and consistent from study to study and from individual to individual. Thus, they provide a crucial datum that must be accounted for by any model of conceptual organization.
One possibility is that one or more of the PLTC verb regions represent the semantics of events (Wu et al., 2007). We found that some of the PLTC verb-selective regions are just anterior to biological motion perception regions in the STS. This finding is consistent with the Anterior Shift Hypothesis proposed by Thompson-Schill et al. (Thompson-Schill, 2003; Kable et al., 2005; Thompson-Schill et al., 2006). According to this hypothesis, semantic regions lie anterior to perceptual regions that are important for acquiring information about a particular category of concepts, but information in the semantic regions is organized along different dimensions from those used in the sensory regions (Thompson-Schill, 2003; Thompson-Schill et al., 2006). It is possible that during development biological motion is an important source of information for learning about actions. Bootstrapping from these action representations, abstract events (such as mental actions and changes of state) could come to be represented in this PLTC region. Critically, these PLTC event regions do not represent perceptual features. Rather, perceptual representations serve as input during development to form distinct conceptual features. This interpretation is at present highly speculative. The proximity of biological motion perception and verb regions may have evolutionary origins or may be coincidental. Future research is required to resolve this question.
One important observation is that multiple, anatomically distinct regions in bilateral PLTC showed preferential responses to verbs. Each of these regions may serve a distinct functional role. It is possible, therefore, that some of the inconsistency in the neuroimaging literature on verb processing stems from the conflation of these regions. In future studies, it will be important to distinguish these regions in individual subjects and measure their responses to both semantic and grammatical properties of verbs (morphological complexity, number of thematic and subcategorization frames, argument structure, imageability, etc.) (for similar research, see Grewe et al., 2007; Thompson et al., 2007).
Our data demonstrate that previously identified PLTC action regions respond preferentially to verbs relative to nouns, even when the nouns have higher visual-motion properties. We hypothesize that these PLTC regions play a role in representing the semantic category of events or the grammatical category of verbs. Neither visual-motion perception regions, nor PLTC action verb regions, respond preferentially to words that are high in visual-motion features. Our findings suggest that sensory features do not form the substrate of conceptual representation for word comprehension.
This work was supported in part by National Institutes of Health Grants R01 MH067008, R01 DC006842 (M.B., A.C., A.P.L.), K24 RR018875, R01 EY12091, and R21 EY0116168 (A.P.L.). We thank Susan Whitfield-Gabrieli, Lucy Chen, Jonathan K. Scholz, and the Athinoula A. Martinos Imaging Center at the McGovern Institute for Brain Research at MIT for their help with fMRI analyses and data collection. We also thank Michael Frank, Jeffrey Ellenbogen, David Kemmerer, and an anonymous reviewer for comments on previous drafts of this manuscript.
Correspondence should be addressed to Marina Bedny, Bernson-Allen Center for Non-Invasive Brain Stimulation, Beth Israel Deaconess Medical Center, Harvard Medical School, 330 Brookline Avenue, KS-158, Boston, MA 02215.