Abstract
When an object is described as changing state during an event, do the representations of those states compete? The distinct states they represent cannot coexist at any one moment in time, yet each representation must be retrievable at the cost of suppressing the other possible object states. We used functional magnetic resonance imaging of human participants to test whether such competition does occur, and whether this competition between object states recruits brain areas sensitive to other forms of conflict. In Experiment 1, the same object was changed either substantially or minimally by one of two actions. In Experiment 2, the same action either substantially or minimally changed one of two objects. On a subject-specific basis, we identified voxels most responsive to conflict in a Stroop color-word interference task. Voxels in left posterior ventrolateral prefrontal cortex most responsive to Stroop conflict were also responsive to our object state-change manipulation, and were not responsive to the imageability of the described action. In contrast, voxels in left middle frontal gyrus responsive to Stroop conflict were not responsive even to language, and voxels in left middle temporal gyrus that were responsive to language and imageability were not responsive to object state-change. Results suggest that, when representing object state-change, multiple incompatible representations of an object compete, and the greater the difference between the initial state and the end state of an object, the greater the conflict.
Introduction
Event comprehension requires the ability to keep track of multiple representations of an object as it is altered in state or location. Recent work on language-mediated eye-movements suggests that the mental representation of a described object is dissociable from the perceived object in a concurrently presented visual scene, and suggests further that multiple representations of (all or parts of) the same object in different states may compete and interfere with one another during event processing (Altmann and Kamide, 2009).
On reading “The squirrel will crack the acorn,” we must represent that the acorn existed in distinct states: cracked and intact. If an immediately succeeding sentence reads “And then, it will lick the acorn,” the cracked state must be retrieved; if, instead, that sentence reads “But first, it will lick the acorn,” the intact state must be retrieved. Now consider replacing “The squirrel will crack the acorn” with “The squirrel will sniff the acorn.” Regardless of the “but first” or “and then,” there is no conflict regarding the nature of the acorn's representation to be retrieved. We hypothesize that reading about the cracked acorn will recruit brain regions usually associated with conflict resolution, whereas reading about the sniffed acorn will not.
Here, we test the proposal that selecting from among distinct states of the same object will selectively recruit prefrontal cortex regions sensitive to semantic conflict, and that this increased activation will overlap on a subject-specific basis with conflict-dependent activation in a standard interference task. Event comprehension trials for each experiment varied in the degree to which a described object was changed in state. In Experiment 1, the same object was changed either substantially or minimally by one of two actions (“crack” or “sniff”). In Experiment 2, the same action (“stomp on”) either substantially or minimally changed one of two objects (an egg or a penny). Conflict-dependent fMRI data collected during a Stroop color-word interference task were used to create subject-specific regions of interest (ROIs) in left posterior ventrolateral prefrontal cortex (pVLPFC), a brain area responsive to semantic conflict (Thompson-Schill et al., 2005). We additionally examined activation in two other ROIs: (1) voxels in left middle frontal gyrus (MFG) that were responsive to Stroop conflict but unresponsive to sentence comprehension; and (2) voxels in left middle temporal gyrus (MTG) that were responsive to sentence comprehension but unresponsive to Stroop conflict.
In each experiment, the rated degree to which an object changed in state during an event, but not the rated imageability of the described action, parametrically predicted the amplitude of the BOLD response in left pVLPFC voxels most responsive to Stroop conflict. In contrast, object state-change did not predict activation in either left MFG or left MTG; in left MTG we instead observed an effect of action imageability. Across complementary manipulations of action (Experiment 1) and object (Experiment 2), the consistent linear effect of object state-change on conflict-responsive areas of left pVLPFC indicates that multiple states of an object do compete during event processing when the object is changed from its original state.
Materials and Methods
Subjects.
Sixteen right-handed native English speakers (9 female), aged 18–28 years, participated in Experiment 1, and a separate sample of 16 right-handed native English speakers (8 female), aged 19–33 years, participated in Experiment 2. Two additional subjects from Experiment 2 were excluded from data analysis and replaced due to unusually poor performance on the event comprehension task; one subject correctly identified fewer than half of the catch trials; the other subject had a false-alarm rate that was 10 times the average of all Experiment 2 subjects. All fMRI subjects were paid $20 per hour and were recruited from within the University of Pennsylvania community. Subjects gave informed consent as approved by the University of Pennsylvania Institutional Review Board. Additionally, 522 University of Pennsylvania undergraduate students participated for course credit in an online task used for stimulus norming (273 subjects in Experiment 1; 249 subjects in Experiment 2). All subjects spoke English as a first language.
Event stimuli.
Event comprehension items for each experiment consisted of two sentences describing a person or an animal acting upon a single object. Across conditions in each experiment, the object acted upon was either minimally or substantially changed in the first-sentence event. In Experiment 1, we varied the first-sentence action to induce the state-change manipulation; the object acted upon was identical for both “substantial state-change” and “minimal state-change” conditions (e.g., “The squirrel will crack/sniff the acorn”). In Experiment 2, we held the action constant across conditions, and varied the object to induce the state-change manipulation (e.g., “The girl will stomp on the penny/egg”). By separately varying the described action and the described object across Experiments 1 and 2, we can rule out the possibility that changes in pVLPFC activation are due to changes in the verb alone (Experiment 1) or the object alone (Experiment 2); that is, we test whether object state-change drives conflict-dependent pVLPFC activation independent of variations in either the action or the object. Table 1 shows example items from each experiment, along with the object state-change and action imageability ratings corresponding to those items.
The first-sentence verb for each item in Experiment 1 was matched across conditions on lexical ambiguity, measured as the number of distinct meanings (t(238) = 0.75, p = 0.45; Burke, 2009), and on frequency of use (t(238) = 1.00, p = 0.32; Brysbaert and New, 2009). The object referred to in each item in Experiment 2 was similarly matched across conditions on both lexical ambiguity (t(198) = 0.30, p = 0.77), and frequency (t(198) = −0.89, p = 0.37). The action described in the second sentence was identical across conditions in both experiments, and always minimally affected the object. In Experiment 1, the temporal phrase at the beginning of each second sentence was either “but first” or “and then.” We included this manipulation to test the additional hypothesis that the “crack … but first” cases would engender increased activity compared with the “crack … and then” cases because of the need to switch the focus from the newly changed state to the previous (unchanged) state. However, we observed the same pattern of neural activity for both the “and then” and “but first” conditions in Experiment 1, and the contrast of “but first” versus “and then” did not reveal any reliable clusters of increased activation anywhere in the brain. (Though not reliable after correcting for multiple comparisons, the largest cluster of increased activation for conditions that required temporal resequencing was in left posterior superior temporal sulcus, an area often linked to speech processing as well as to theory of mind; cf. Hein and Knight, 2008). In Experiment 2, we kept temporal context constant across items by always beginning the second sentence with “and then.” For both experiments, subjects were exposed to all stimuli and all conditions in a fully factorial repeated measures design, but never saw more than one version of each stimulus.
Event ratings.
Object state-change and action imageability ratings were collected through online surveys for the first and second sentence of each item in each experiment. Each survey subject rated only one alternative sentence of each item. For object state-change ratings, subjects rated “the degree to which the depicted object will be at all different after the action occurs than it had been before the action occurred.” Subjects rated each item on a 7-point scale ranging from “just the same” to “completely changed.” For action imageability, subjects rated “how much a sentence brings to mind a clear mental image of a particular action.” Subjects rated each item on a 7-point scale ranging from “not imageable at all” to “extremely imageable.”
Object state-change and action imageability ratings for the first-sentence events included data from 85 subjects for Experiment 1, and 101 subjects for Experiment 2. The first-sentence event in the “minimal state-change” condition received an average object state-change rating of 1.97 (SD = 0.57) in Experiment 1, and 2.78 (SD = 0.79) in Experiment 2. The first-sentence event in the “substantial state-change” condition received an average object state-change rating of 4.64 (SD = 0.84) in Experiment 1, and 4.96 (SD = 0.74) in Experiment 2. Object state-change ratings varied broadly within the “minimal state-change” and “substantial state-change” conditions (Fig. 1); the overall difference in object state-change between conditions was reliable in each experiment (p values <0.001). The average first-sentence action imageability rating for the “minimal state-change” condition was 4.89 (SD = 0.64) in Experiment 1, and 5.57 (SD = 0.42) in Experiment 2. For the “substantial state-change” condition, the average first-sentence action imageability rating was 5.46 (SD = 0.41) in Experiment 1, and 5.59 (SD = 0.47) in Experiment 2. The difference in action imageability between conditions was reliable in Experiment 1 (p < 0.001), but was not reliable for Experiment 2 (p = 0.18). For both experiments, object state-change correlated with neither frequency nor lexical ambiguity (see above, Event stimuli; p values >0.4).
Object state-change and action imageability ratings for the second-sentence events included data from 95 subjects for Experiment 1, and 98 subjects for Experiment 2. The second-sentence events of all items in both experiments were designed to involve minimal object state-change. To confirm that there were no differences between conditions, we used a separate online survey to collect object state-change and action imageability ratings for these events. In Experiment 1, the second sentence of each item was identical across conditions, and had an average object state-change rating of 1.90 (SD = 0.47), and an average action imageability rating of 4.52 (SD = 0.69). For the second sentence in Experiment 2, which had a different object in the “minimal” and “substantial” state-change conditions, the average object state-change rating was 1.69 (SD = 0.44) for “minimal state-change” items and 1.74 (SD = 0.49) for “substantial state-change” items, while the average action imageability rating was 4.23 (SD = 0.83) for “minimal state-change” items, and 4.13 (SD = 0.84) for “substantial state-change” items. Experiment 2 items did not reliably differ across conditions in either the object state-change or the action imageability of the second-sentence event (p values >0.3).
For each experiment, we additionally collected ratings for the likelihood that the second sentence of each item would follow the first sentence of that item (if that first sentence had been read, for example, in a magazine or newspaper). We used separate online surveys to collect data from 93 subjects for Experiment 1 (which included 4 conditions), and 50 subjects for Experiment 2 (which included 2 conditions). The average likelihood rating across “minimal state-change” event sequences was 4.06 (SD = 0.78) in Experiment 1, and 4.12 (SD = 0.90) in Experiment 2. The average likelihood rating for “substantial state-change” event sequences was 4.08 (SD = 0.78) in Experiment 1, and 4.28 (SD = 0.95) in Experiment 2. There was no statistical difference between state-change conditions of either experiment in the rated likelihood of the event sequences (p values >0.2).
Event comprehension task.
The event comprehension task in each fMRI experiment was separated into five runs, with an equal number of trials of each condition in each run. Experiment 1 included 120 experimental trials split across four conditions. Experiment 2 included 100 experimental trials split across two conditions. Additionally, subjects in each experiment read 15 “catch trials” in which the second-sentence event of the trial was implausible given the first-sentence event (e.g., “The mother will eat the sandwich. And then, she will serve the sandwich.”). The trial structure was identical in the two experiments. Each trial lasted six seconds, during which the first sentence was presented for three seconds, followed by the second sentence for three seconds. Subjects pressed the two outer buttons of a keypad when the second-sentence event was implausible given the first-sentence event. Trials were separated by 3–15 s of jittered fixation, optimized for statistical power using the OptSeq algorithm (http://surfer.nmr.mgh.harvard.edu/optseq/). Stimuli were presented using E-Prime (Psychology Software Tools).
Stroop color-word interference task.
After the event comprehension task, subjects in each experiment performed a 10 min button-press Stroop color identification task, based on previously described procedures (Milham et al., 2001; January et al., 2009). The response box for this task was restricted to three buttons: yellow, green, and blue. Stimuli included four trial types: response-eligible conflict, response-ineligible conflict, and two groups of neutral trials. Subjects were presented with a single word for each trial, and instructed to press the button corresponding to the typeface color of each word. Conflict trials could be either response-eligible or response-ineligible. For response-eligible conflict trials, the color term matched one of the subject's possible responses (i.e., yellow, green, or blue), but always mismatched the typeface color. For response-ineligible conflict trials, the color term (orange, brown, or red) mismatched the typeface color, and also was not a possible response. Separate sets of noncolor neutral trials (e.g., farmer, stage, tax) were intermixed with either response-eligible conflict trials or response-ineligible conflict trials. Both response-eligible and response-ineligible conflict trial types have previously been demonstrated to induce conflict at nonresponse levels, while response-eligible conflict trials additionally induce conflict at the level of motor response (Milham et al., 2001). To optimize power for identifying subject-specific conflict-responsive subregions of left pVLPFC and left MFG, we considered only the main effect of conflict trials versus neutral trials.
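The trial taxonomy described above can be summarized in a short sketch. The function and set names are hypothetical, the word lists contain only the examples given in the text, and the restriction of typeface colors to the three response colors is an assumption inferred from the response box; this is an illustration, not the stimulus code used in the study.

```python
# Hypothetical illustration of the Stroop trial types described in the text.
RESPONSE_COLORS = {"yellow", "green", "blue"}    # colors mapped to buttons
INELIGIBLE_COLORS = {"orange", "brown", "red"}   # color words with no button


def classify_trial(word: str, ink: str) -> str:
    """Label a trial by the relation between the word and its typeface color."""
    if ink not in RESPONSE_COLORS:
        # Assumption: every typeface color corresponds to one of the buttons.
        raise ValueError("typeface color must be one of the response colors")
    if word in RESPONSE_COLORS:
        if word == ink:
            raise ValueError("conflict trials always mismatch the typeface color")
        return "response-eligible conflict"
    if word in INELIGIBLE_COLORS:
        return "response-ineligible conflict"
    return "neutral"


def is_conflict(label: str) -> bool:
    """Collapse over eligibility, as in the conflict-versus-neutral contrast."""
    return label.endswith("conflict")
```

Collapsing both conflict types into a single category, as in `is_conflict`, mirrors the main-effect contrast used to define the ROIs.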
Imaging procedure.
Structural and functional data were collected on a 3-T Siemens Trio system with an eight-channel array head coil. Structural data included axial T1-weighted localizer images with 160 slices and 1 mm isotropic voxels (TR = 1620 ms, TE = 3.87 ms, TI = 950 ms). Functional data included echo-planar fMRI performed in 44 axial slices and 3 mm isotropic voxels (TR = 3000 ms, TE = 30 ms). Twelve seconds preceded data acquisition in each functional run to approach steady-state magnetization.
Data analysis.
Image preprocessing and statistical analyses were performed using AFNI (Cox, 1996). Functional data were sinc interpolated to correct for slice timing, and aligned to the mean of all functional images, using a six parameter iterated least-squares procedure. The functional data were then registered with each subject's high-resolution anatomical dataset, and normalized to a standard template in Talairach space. Finally, functional data were smoothed with an 8 mm FWHM Gaussian kernel, and scaled to percentage signal change. Each two-sentence trial was modeled as a 6 s boxcar function convolved with a canonical hemodynamic response function, with an additional covariate in the subject-wise parametric analysis to model the degree of object state-change (or action imageability) of the item for each trial. Beta coefficients were estimated using a modified general linear model that included a restricted maximum likelihood estimation of the temporal auto-correlation structure, with a polynomial baseline fit, and the motion parameters and global signal as covariates of no interest.
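As a minimal sketch of the trial model described above, the following Python code builds the two task regressors for one run: a 6 s boxcar per trial convolved with a hemodynamic response function, and a second regressor in which each boxcar is weighted by the item's rating. The double-gamma shape parameters, the 0.1 s convolution grid, the mean-centering of the ratings, and all function names are assumptions made for illustration; the analysis reported here was run in AFNI, and this code does not reproduce that implementation.

```python
import math

TR = 3.0          # seconds per volume (from the imaging procedure)
TRIAL_DUR = 6.0   # each two-sentence trial lasted 6 s
DT = 0.1          # fine time grid for the convolution (assumption)


def hrf(t: float) -> float:
    """Double-gamma response function; the shape parameters are assumed."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6.0


def regressor(onsets, weights, n_vols):
    """Weighted boxcars convolved with the HRF, sampled once per TR."""
    n_fine = int(round(n_vols * TR / DT))
    box = [0.0] * n_fine
    for onset, w in zip(onsets, weights):
        start = int(round(onset / DT))
        stop = int(round((onset + TRIAL_DUR) / DT))
        for i in range(start, min(stop, n_fine)):
            box[i] += w
    kernel = [hrf(k * DT) for k in range(int(round(30.0 / DT)))]
    step = int(round(TR / DT))
    out = []
    for v in range(n_vols):
        i = v * step
        acc = 0.0
        for k in range(min(i + 1, len(kernel))):
            acc += box[i - k] * kernel[k]
        out.append(acc * DT)
    return out


def design_columns(onsets, ratings, n_vols):
    """Return (trial regressor, mean-centered parametric regressor)."""
    mean_rating = sum(ratings) / len(ratings)
    centered = [r - mean_rating for r in ratings]  # centering is an assumption
    main = regressor(onsets, [1.0] * len(onsets), n_vols)
    param = regressor(onsets, centered, n_vols)
    return main, param
```

Centering the ratings keeps the parametric regressor from simply duplicating the trial regressor, so its β coefficient reflects how signal amplitude scales with the rating rather than the mean response to a trial.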
Our analyses are focused on three ROIs, one (pVLPFC) that is our primary region of interest and two (left MFG and left MTG) that serve as controls for our purposes. Stroop-conflict ROIs in both left pVLPFC and left MFG were functionally defined separately for each subject using data obtained during the Stroop color-word interference task. Additionally, each Stroop-conflict ROI was anatomically constrained based on probabilistic anatomical atlases (Eickhoff et al., 2005) transformed into Talairach space. Left pVLPFC was defined as the combination of pars triangularis (Brodmann area 45), pars opercularis (Brodmann area 44), and the anterior half of the inferior frontal sulcus. Across subjects from both experiments, the anatomical definition of left pVLPFC included an average of 784 voxels (SD = 35). Left MFG included portions of Brodmann areas 6, 9, 10, and 46. Across subjects from both experiments, the anatomical definition of left MFG included an average of 962 voxels (SD = 40). Across subjects from both experiments, the anatomical definition of left MTG included an average of 644 voxels (SD = 33). Within these broad anatomical boundaries, each Stroop-conflict ROI comprised the 50 voxels with the highest t-statistics in a subject-specific contrast of conflict trials versus neutral trials in the Stroop color-word interference task, while the sentence-comprehension ROI comprised the 50 left MTG voxels with the highest t-statistics in a subject-specific contrast of all event comprehension trials (averaged across conditions) versus baseline. Although analyses are reported for ROIs of 50 voxels, the same statistical patterns were consistently observed across a broad range of ROI sizes. All statistical tests for each ROI were evaluated at the two-tailed 0.05 level of significance. Finally, we assessed the object state-change effect in each voxel across the whole brain, corrected for multiple comparisons, which we report at the end of the Results.
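The voxel-selection step can be sketched as follows, assuming per-voxel t-statistics and an anatomical mask stored as flat lists of equal length. The names `top_voxels` and `roi_mean` are hypothetical, and the sketch makes no claim about how ties or voxel ordering were handled in the actual analysis.

```python
def top_voxels(t_stats, anat_mask, n=50):
    """Indices of the n voxels inside the mask with the highest t-statistics."""
    candidates = [i for i, inside in enumerate(anat_mask) if inside]
    if len(candidates) < n:
        raise ValueError("anatomical mask has fewer voxels than requested")
    candidates.sort(key=lambda i: t_stats[i], reverse=True)
    return sorted(candidates[:n])


def roi_mean(values, roi):
    """Average a per-voxel quantity (for example, a beta) over the ROI."""
    return sum(values[i] for i in roi) / len(roi)
```

For the Stroop-conflict ROIs, `t_stats` would come from the conflict-versus-neutral contrast, while the values passed to `roi_mean` would come from the event comprehension task, so the data used to select the voxels are separate from the data used to test them.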
Results
Stroop color-word interference task
Across Experiments 1 and 2, subjects correctly answered 98% of all trials. The average response time was 706 ms for conflict trials and 656 ms for neutral trials (t(31) = 6.60, p < 0.001). In a group-level contrast that included all subjects from both experiments, the most reliable cluster of voxels with an activation difference between conflict trials and neutral trials was centered between the inferior frontal gyrus (pars triangularis) and the inferior frontal sulcus of left pVLPFC (Fig. 2A). Additional clusters of increased activation for conflict trials relative to neutral trials were observed in left MFG and left intraparietal sulcus.
Stroop-conflict ROI in left pVLPFC
To determine voxels most responsive to conflict on an individual subject level, we identified for each subject the 50 left pVLPFC voxels with the highest t-statistics in a contrast of conflict trials versus neutral trials in the Stroop interference task. The location of the top 50 conflict-responsive voxels varied widely across subjects, with slightly more cross-subject overlap in the most posterior area of left pVLPFC, at the junction of pars triangularis, pars opercularis, and the inferior frontal sulcus (Fig. 2B). Within each subject-specific Stroop-conflict ROI in left pVLPFC, we examined the effect of object state-change on the amplitude of the BOLD signal.
Experiment 1 event comprehension (object fixed, action varied)
Subjects correctly identified 97% of catch trials in the Experiment 1 event comprehension task, and committed false alarms (i.e., classifying a noncatch trial as implausible) on <2% of experimental trials. There was a slightly but reliably greater number of false alarms for the substantial state-change trials (2%) than for the minimal state-change trials (1%; t(15) = 2.46, p = 0.03). Due to the small numbers involved, this difference was also tested in a χ2 test, and was also found to be significant (χ2 = 9.62, p = 0.002). False alarm trials, along with catch trials, were coded separately for all fMRI analyses.
The average signal change across all sentence conditions was reliably above baseline in the left pVLPFC Stroop-conflict ROI (t(15) = 8.59, p < 0.001), indicating that this ROI was generally responsive during sentence comprehension. Because action imageability ratings were correlated with object state-change (r = 0.50), we removed variance predicted by the action imageability ratings before comparing the “substantial state-change” and “minimal state-change” conditions, though including action imageability as a covariate did not influence the reliability of any effects. A significant main effect for object state-change emerged within the left pVLPFC Stroop-conflict ROI (t(15) = 2.50, p = 0.02; Fig. 3A), but there was no effect for temporal order (“and then” versus “but first”) and no interaction (p values >0.4). Next, we used the data from ratings of object state-change and action imageability to examine the relationship between these stimulus dimensions and signal change within the left pVLPFC Stroop-conflict ROI. Analyses separately tested the reliability of object state-change and action imageability effects across subjects and across items. Because we did not find an effect of the temporal context of the second sentence (either “but first” or “and then”), we averaged across these temporal conditions in each Experiment 1 parametric analysis that used the object state-change or action imageability ratings.
In a subject-wise parametric analysis, we measured the extent to which, for each subject, the BOLD signal amplitude within the left pVLPFC Stroop-conflict ROI varied in proportion to either object state-change or action imageability. Data were separately modeled for each subject, using one covariate to model each trial presentation, and a second covariate to model the degree of object state-change (or action imageability) of the item for each trial. Estimation of these β coefficients converged with results from the categorical analyses above, as object state-change stimulus ratings reliably predicted left pVLPFC signal amplitude (t(15) = 3.44, p = 0.004). In contrast, action imageability ratings did not reliably predict left pVLPFC signal amplitude (t(15) = −0.27, p = 0.79). Moreover, across a broad range of ROI sizes, object state-change reliably predicted signal within the left pVLPFC Stroop-conflict ROI, while action imageability did not reliably predict activation (Fig. 3B). Interestingly, while object state-change consistently predicted left pVLPFC signal amplitude, both across subjects and across ROI sizes, there was much greater variance across subjects in the degree to which the action imageability ratings predicted signal. This may reflect individual experiential differences across subjects. To further visualize this dissociation, we binned the items into quartiles according to either the object state-change or the action imageability ratings of the stimuli (Fig. 3C).
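The group-level step of the subject-wise analysis amounts to testing whether the mean of the per-subject parametric β coefficients differs from zero. A minimal sketch, assuming one ROI-averaged coefficient per subject and using a hypothetical function name:

```python
import math
import statistics


def one_sample_t(betas):
    """t statistic and degrees of freedom for a test of mean(betas) against 0."""
    n = len(betas)
    mean = statistics.fmean(betas)
    sem = statistics.stdev(betas) / math.sqrt(n)  # standard error of the mean
    return mean / sem, n - 1
```

With 16 subjects per experiment, this yields the 15 degrees of freedom reported for the subject-wise tests.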
In an item-wise analysis, we measured the extent to which, for each item averaged across subjects, BOLD signal amplitude within the left pVLPFC Stroop-conflict ROI could be predicted by the stimulus ratings. Data were separately modeled for each trial, and then individual β coefficients were binned by item across subjects. Because each of the 120 items included 2 state-change versions (i.e., “substantial state-change” and “minimal state-change”), and because each subject read only one version of each item, there were 238 degrees of freedom in the Experiment 1 item analysis, and the average percentage signal change of each item was composed of data from 8 of the 16 subjects. Object state-change ratings correlated with percentage signal change in the left pVLPFC Stroop-conflict ROI (r(238) = 0.15, p = 0.02), while action imageability ratings did not predict signal (r(238) = 0.01, p = 0.82; Fig. 3D).
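The item-wise correlation and its degrees of freedom can be illustrated with a short sketch. The function names are hypothetical, and the inputs are assumed to be one rating and one subject-averaged percentage signal change per item version.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def item_df(n_items, versions_per_item=2):
    """Degrees of freedom for a correlation across all item versions."""
    return n_items * versions_per_item - 2


def r_to_t(r, df):
    """Convert a correlation to the equivalent t statistic."""
    return r * math.sqrt(df) / math.sqrt(1 - r * r)
```

Here `item_df(120)` gives the 238 degrees of freedom of the Experiment 1 item analysis, and `r_to_t(0.15, 238)` is approximately 2.34.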
Experiment 2 event comprehension (object varied, action fixed)
Subjects correctly identified 92% of catch trials in the Experiment 2 event comprehension task, and committed false alarms on 2% of experimental trials, with no reliable difference in false alarms between the substantial state-change and minimal state-change conditions (t(15) = 1.21, p = 0.25; χ2 = 1.87, p = 0.17). As in Experiment 1, false alarm trials were coded separately, along with catch trials, for all fMRI analyses.
All Experiment 1 effects of object state-change on activation in the left pVLPFC Stroop-conflict ROI replicated in Experiment 2. As in Experiment 1, the average percentage signal change across conditions was reliably different from baseline (t(15) = 6.65, p < 0.001). With action imageability covaried out, there was a reliable categorical effect of the “substantial state-change” condition versus the “minimal state-change” condition (t(15) = 3.03, p = 0.008; Fig. 4A). In the subject-wise parametric analysis, object state-change reliably predicted ROI activation (t(15) = 2.98, p = 0.009), while action imageability did not (t(15) = −0.37, p = 0.71). As in Experiment 1, this pattern was reliable across a broad range of ROI sizes, with greater variance across subjects in the action imageability parameter estimate than in the object state-change parameter estimate (Fig. 4B). In the item-wise analysis, object state-change ratings reliably predicted percentage signal change in the left pVLPFC Stroop-conflict ROI (r(198) = 0.24, p < 0.001), while action imageability did not predict signal change (r(198) = 0.00, p = 0.98; Fig. 4D).
Though the same verb was used across conditions in the first sentence of each Experiment 2 item, individual verbs may have multiple action connotations. To control for the potential variability of action connotation, a large subset of the Experiment 2 stimuli (60 of the 100 total items) were matched as nearly as possible on the specific action connotation of the first-sentence verb. In the item-level analysis, the pattern of results in the left pVLPFC Stroop-conflict ROI for this subset of the stimuli was identical to that of the full Experiment 2 stimulus set of 100 items for object state-change (r(58) = 0.27, p < 0.001), and for action imageability (r(58) = 0.00, p = 0.99).
Comparisons across ROIs
As is evident in Figure 2A, the group-level analysis of the Stroop color-word interference task revealed a separate cluster of conflict-responsive voxels outside of left pVLPFC, in left MFG. Likewise, brain areas other than left pVLPFC, including left MTG, were generally active during sentence reading. We analyzed data from the left MTG region in particular, because of its putative involvement in semantic memory (cf. Martin, 2007). To examine task-related effects in conflict-responsive MFG regions and language-responsive MTG regions, we identified for each subject the 50 left MFG voxels with the highest t-statistics in a contrast of conflict trials versus neutral trials in the Stroop task, and the 50 left MTG voxels with the highest t-statistics in a contrast of all event comprehension trials (averaged across conditions) versus baseline (Fig. 5A). As was the case in left pVLPFC, the location of the top 50 conflict-responsive voxels in left MFG, and the top 50 language-responsive voxels in left MTG, varied widely across subjects (Fig. 5A).
Unlike the pVLPFC region described earlier, these two control regions responded to only one of our two functional localizers: The Stroop-conflict ROI in left MFG was not on average responsive during sentence reading, while the sentence-comprehension ROI in left MTG was not responsive to Stroop conflict. The average left MFG signal change across all sentential conditions was not reliably different from baseline in either Experiment 1 (t(15) = −0.02, p = 0.98) or Experiment 2 (t(15) = −1.10, p = 0.29). Likewise, left MTG signal change was not reliably different between Stroop conflict trials and neutral trials in either Experiment 1 (t(15) = 0.55, p = 0.59) or Experiment 2 (t(15) = −1.29, p = 0.22). Within these subject-specific ROIs, we repeated for each experiment the subject-wise and item-wise analyses described above for object state-change and action imageability.
For each experiment, we used an ANOVA to test for the interaction between region (pVLPFC, MFG, and MTG) and the degree to which the object state-change ratings predicted BOLD response amplitude. Object state-change β coefficients differed significantly across ROIs for both Experiment 1 (F(2,30) = 3.98, p = 0.03), and Experiment 2 (F(2,30) = 3.74, p = 0.04). Planned comparisons further revealed that object state-change did not reliably predict signal amplitude in either the left MFG Stroop-conflict ROI (Fig. 5B) or the left MTG sentence-comprehension ROI (Fig. 5C). For Experiment 1, β coefficients for object state-change were reliably different between left MTG and left pVLPFC ROIs (t(15) = 3.43, p = 0.004), while the difference between left MFG and left pVLPFC β coefficients did not reach significance (t(15) = 1.17, p = 0.26). For Experiment 2, object state-change β coefficients in both left MTG (t(15) = 2.36, p = 0.03) and left MFG (t(15) = 2.44, p = 0.03) were reliably different from pVLPFC. Object state-change β coefficients were not reliably different between left MTG and left MFG in either experiment (p values >0.1).
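One way to test whether β coefficients differ across regions is a one-way repeated-measures ANOVA with region as the within-subject factor. The sketch below assumes that form, which the text does not specify in detail; the function name is hypothetical, and no sphericity correction is applied.

```python
def rm_anova(data):
    """One-way repeated-measures ANOVA.

    data[s][k] is the beta for subject s in region k.
    Returns (F, df_effect, df_error).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # number of regions
    grand = sum(sum(row) for row in data) / (n * k)
    subj_means = [sum(row) / k for row in data]
    cond_means = [sum(row[j] for row in data) / n for j in range(k)]
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((v - grand) ** 2 for row in data for v in row)
    ss_error = ss_total - ss_cond - ss_subj   # region-by-subject residual
    df_cond = k - 1
    df_error = (n - 1) * (k - 1)
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error
```

With 16 subjects and 3 regions, the degrees of freedom are 2 and 30, matching the F(2,30) tests reported above.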
We conducted a similar set of analyses to examine interactions between region and imageability. The action imageability β coefficients did not reliably differ across ROIs for either Experiment 1 (F(2,30) = 1.97, p = 0.16), or Experiment 2 (F(2,30) = 1.31, p = 0.28). Experiment 1 planned comparisons, however, revealed a negative correlation between action imageability ratings and left MTG response amplitude (t(15) = −2.2, p = 0.04), while the difference between action imageability β coefficients in left MTG and left pVLPFC was marginally reliable (t(15) = 1.77, p = 0.10). In Experiment 2, in which the variance of the action imageability ratings was more constrained (σ2 = 0.37 for Experiment 1; σ2 = 0.18 for Experiment 2), MTG β coefficients for action imageability did not reliably differ from either baseline or from any other ROI (p values >0.1).
Whole-brain conjunction analysis of Experiments 1 and 2
To compare the influence of object state-change on neural activity across the two experiments, we first covaried out activation predicted by the action imageability ratings for each experiment, and then measured the extent to which activation of each voxel was predicted by the object state-change ratings (correcting for multiple comparisons). Both experiments showed extensive change-related activity in left pVLPFC (Fig. 6A; Table 2). Additionally, there was an interaction between Experiment and the object state-change effect in the right inferior parietal lobule, an area specifically implicated in studies of gesture recognition and body schema, in which action understanding is independent of objects (Hermsdörfer et al., 2001; Chaminade et al., 2005). Right supramarginal gyrus was significantly more responsive to object state-change in Experiment 1, in which the described action varied across “substantial state-change” and “minimal state-change” conditions, than in Experiment 2, in which the described action was identical across conditions (Fig. 6B).
Discussion
Tracking objects across events requires maintaining multiple representations of the same object in different states. We demonstrate that this component of event cognition elicits a neural response in left pVLPFC that overlaps with increased activation for conflict trials in a Stroop color-word interference task. Through analysis of rated stimulus norms, we further observe that the degree to which an object is changed during an event parametrically predicts the BOLD response amplitude in left pVLPFC voxels most sensitive to Stroop conflict; the rated imageability of the action does not. In Experiment 1, the described object was identical for the “substantial state-change” and “minimal state-change” conditions; the state-change manipulation was thus driven by the described action. In Experiment 2, the described action was identical across conditions; the state-change manipulation was driven instead by the affordances of the described object.
Convergence across experiments demonstrates the generalizability of the effects of object state-change on semantic conflict. By varying the number of voxels included in the left pVLPFC Stroop-conflict ROI, we demonstrate that this effect is robust within subjects across a wide range of ROI sizes. Moreover, the reliable item-wise correlations between object state-change ratings and BOLD response amplitude in the left pVLPFC Stroop-conflict ROI suggest that the effects generalize across a diverse stimulus population of actions, objects, and events, and highlight the utility of item analysis of fMRI data (Bedny et al., 2007).
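The ROI-size check mentioned above can be sketched as follows, assuming that each subject's ROI is built from the voxels ranked highest on a Stroop-conflict statistic and that the state-change effect is then averaged over the selected voxels. The function names, the ranking statistic, and the toy values are assumptions made for illustration.

```python
def top_n_roi_mean(stroop_stat, effect_beta, n_voxels):
    """Mean effect over the n_voxels voxels with the largest Stroop statistic.

    stroop_stat and effect_beta are parallel lists, one entry per voxel.
    """
    if not 1 <= n_voxels <= len(stroop_stat):
        raise ValueError("n_voxels must be between 1 and the voxel count")
    ranked = sorted(range(len(stroop_stat)),
                    key=lambda i: stroop_stat[i], reverse=True)
    chosen = ranked[:n_voxels]
    return sum(effect_beta[i] for i in chosen) / n_voxels


def roi_size_sweep(stroop_stat, effect_beta, sizes):
    """Repeat the ROI average for each candidate ROI size."""
    return {n: top_n_roi_mean(stroop_stat, effect_beta, n) for n in sizes}


if __name__ == "__main__":
    # Toy values for four voxels in one subject (illustration only).
    stroop_stat = [0.5, 3.0, 2.0, 1.0]
    state_change_beta = [0.0, 0.4, 0.2, 0.1]
    for size, mean_beta in roi_size_sweep(
            stroop_stat, state_change_beta, [1, 2, 4]).items():
        print(size, round(mean_beta, 3))
```

An effect that holds across the sweep is less likely to depend on an arbitrary choice of ROI size.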
In each experiment we observe a dissociation among three sets of voxels: (1) voxels in left pVLPFC that are sensitive to Stroop conflict and are activated above baseline during sentence comprehension; (2) voxels in left MFG that are sensitive to Stroop conflict but are not activated above baseline during sentence comprehension; and (3) voxels in left MTG that are not sensitive to Stroop conflict but are activated above baseline during sentence comprehension. In each experiment, object state-change ratings parametrically predicted BOLD amplitude in the left pVLPFC Stroop-conflict ROI, while action imageability ratings did not. This functional dissociation within the left pVLPFC Stroop-conflict ROI is in stark contrast to patterns of results in both left MFG and left MTG.
In the left MFG Stroop-conflict ROI, which was responsive to Stroop conflict but not to sentence reading, neither object state-change nor action imageability reliably predicted BOLD amplitude. While left MFG has been shown to be responsive to Stroop conflict beyond the level of motor response (Milham et al., 2001), it is generally not associated with semantic conflict (Binder et al., 2009), and dissociates from left pVLPFC with respect to item-specific memory interference, as evidenced by neuroimaging (D'Esposito et al., 1999), patient lesion (Thompson-Schill et al., 2002), and transcranial magnetic stimulation (Feredoes and Postle, 2010) studies. Instead, posterior-most areas of left MFG, where we observe the greatest cross-subject overlap of this ROI, may be specifically involved in maintaining task representations (Derrfuss et al., 2005).
In the left MTG sentence-comprehension ROI, which was not responsive to Stroop conflict but was generally responsive during sentence reading, object state-change did not predict BOLD amplitude in either experiment. However, in Experiment 1, the rated imageability of the described action negatively correlated with MTG signal. Because event comprehension places a stronger demand on semantic retrieval processes when it is more difficult to bring to mind a clear mental image of the described action, the negative correlation of action imageability ratings with left MTG activation is concordant with studies of left MTG responsiveness to difficulty manipulations in semantic retrieval tasks (Whitney et al., 2011). The absence of an action imageability effect in Experiment 2 is predicted by reduced variance of the action imageability ratings (the described action was fixed across state-change conditions). The modulation of left pVLPFC and left MTG by object state-change and action imageability, respectively, replicates previous dissociations between these regions (Thompson-Schill et al., 1999; Bedny et al., 2008), indicating functionally distinct contributions of these regions to event comprehension.
In contrast to left MTG and left MFG, left pVLPFC is consistently shown to be central in resolving competition among incompatible semantic representations (Thompson-Schill et al., 2005). Neuroimaging, patient lesion, and transcranial magnetic stimulation studies demonstrate that left pVLPFC is activated during and is necessary for overriding misinterpretations of syntactically ambiguous sentences (January et al., 2009), selecting context-appropriate meanings of ambiguous words (Metzler, 2001; Hindy et al., 2009), completing sentences that have multiple alternative responses (Robinson et al., 1998, 2005), generating verbs with many semantic competitors (Thompson-Schill et al., 1997), and resolving working memory interference in item recognition (Feredoes et al., 2006).
Stepping back from the ROIs, and examining activation across the entire brain, we see that voxels sensitive to the object state-change manipulation overlapped across experiments in left pVLPFC. In contrast, areas of the inferior parietal lobe that were sensitive to the state-change manipulation in Experiment 1 were not sensitive to this manipulation in Experiment 2. Because the described action varied across conditions in Experiment 1, but was fixed across conditions in Experiment 2, this dissociation is consistent with literature that associates these inferior parietal lobe areas with action representation independent of the objects acted upon (Glover, 2004).
Stepping back further, and considering the theoretical implications of these data, correlations between rated degree of object state-change and BOLD response in the left pVLPFC Stroop-conflict ROI may at first seem consistent with an account that the more an object is changed in state during the first sentence of a trial, the more information must be inferred to derive the context-appropriate representation of the same object in the second sentence. This would predict, however, an interaction with temporal context in Experiment 1, because in the “and then” case, the state computed at the end of the first sentence is identical to that referred to at the end of the second (but would be different in the “but first” case). There was, however, no such interaction. Additionally, Experiment 2 participants only ever read “and then” versions of the stimuli, encouraging maintenance of only the changed instantiation, yet we still observed evidence of conflict. Alternatively, one might suppose that the more an object is changed in state, the more information must be kept in memory. This would not predict any interaction with temporal context. However, the left pVLPFC has previously been shown to be associated with resolving interference in working memory independently of working memory itself (Thompson-Schill et al., 2002). Thus, the location in which we observe sensitivity to object state-change, as well as the functional specificity of the ROI to Stroop conflict, suggests that our data do not reflect memory load.
We conjecture instead that multiple instantiations of the same object (whether of the object representation in its entirety, or of components of the object representation) must be represented when the object is described as changing in state, and that there is interference between these instantiations. This could include interference between the sensorimotor instantiations of the different affordances associated with distinct object states, mediated by the event representations within which multiple object instantiations are distinguished (cf. Zwaan and Radvansky, 1998). Because objects were generally changed from a canonical state to a marked state, the strength of the initially activated object representation may modulate the extent to which this initial representation remains active even after the contextually appropriate object representation has been computed. And while language and memory research (Bower, 2000; Van Dyke and McElree, 2006) has shown evidence of similarity-based interference between actively maintained object representations, we find that the more dissimilar the “before” and “after” instantiations of an object, the greater the interference. This difference between distinct objects (similarity-based interference) and distinct instantiations of a single object (dissimilarity-based interference) may have its roots in the fact that the distinct instantiations of an object across event-time (i.e., the “before” and “after”) are mutually exclusive—they cannot coexist. Distinct objects, on the other hand, can coexist no matter how similar; the greater the overlap between the objects' representations, the greater the interference, but differences between the objects do not have consequences for coexistence and are not inhibitory. 
When we need to categorize distinct representations as instantiations of a single object, left pVLPFC may provide a top-down modulatory signal that biases candidate representations (and the neural patterns that instantiate them) toward the context-appropriate representation of the object, performing an interference resolution process similar to that described for other forms of ambiguity resolution (Thompson-Schill and Botvinick, 2006).
The ability to comprehend, represent, recall, and narrate events is quintessentially human. Yet the representation of multiple instantiations of the same object across “event time” (i.e., before, during, and after the event occurs), and how these may compete with one another, is a topic that has not received attention in cognitive psychology. Together, the data reported here suggest that the need to represent the same object in different states comes at a competitive cost. The work reported here is a step toward identifying these representational mechanisms, and speaks to future cognitive models of object and event representation, allowing more detailed exploration of the representations over which the human cognitive system operates.
Notes
Supplemental material for this article is available at http://www.psych.upenn.edu/stslab/assets/pdf/Hindy_StateChange_stimsets.pdf. The material includes a full stimulus set for each experiment. The stimulus set for Experiment 1 includes 120 items. The stimulus set for Experiment 2 includes 100 items. This material has not been peer reviewed.
Footnotes
This research was funded by an NIH Award to S.L.T.-S. (R01 DC009209), Economic and Social Research Council awards to G.T.M.A. (RES-063-27-0138 and RES-062-23-2749), and a National Science Foundation graduate research fellowship to N.C.H. We are grateful to Kara Cohen for help with stimulus development, and to Xin Kang for help with stimulus norming.
Correspondence should be addressed to Nicholas C. Hindy, Department of Psychology, University of Pennsylvania, 3720 Walnut Street, Philadelphia, PA 19104. hindy@psych.upenn.edu