Abstract
To optimize task performance as circumstances unfold, cognitive control mechanisms configure the brain to prepare for upcoming events through voluntary shifts in task set. A foundational unanswered question concerns whether different domains of cognitive control (e.g., spatial attention shifts, shifts between categorization rules, or shifts between stimulus–response mapping rules) are associated with separate, domain-specific control mechanisms, or whether a common, domain-independent source of control initiates shifts in all domains. Previous studies have tested different domains of cognitive control in separate groups of subjects using different paradigms, yielding equivocal conclusions. Here, using rapid event-related fMRI, we report evidence from a single paradigm in which subjects were cued to perform both shifts of spatial attention and switches between categorization rules. A conjunction analysis revealed a common transient signal evoked by switch cues in medial superior parietal lobule for both domains of control, revealing a single domain-independent control mechanism.
Introduction
Intelligent behavior entails configuring the brain as events unfold to efficiently carry out tasks that will achieve behavioral goals. Effective task performance requires selecting relevant sensory information through deployments of selective attention (Egeth and Yantis, 1997), applying a task-appropriate cognitive operation to this information (Rogers and Monsell, 1995), and executing an appropriate response (Philipp and Koch, 2005).
Transitions from one task set to another are achieved through acts of cognitive control, which have fruitfully been studied using cued task-switching (Sudevan and Taylor, 1987) and attention-shifting (Yantis et al., 2002) paradigms. These paradigms require subjects to switch between two tasks that differ in the state of attention (e.g., attend left or right), in the categorization rule (e.g., odd/even vs high/low digit categorization), or in many other ways. We will refer to these aspects of task set (e.g., the state of visuospatial attention or categorization rule) as distinct “cognitive domains”; we will refer to a specific configuration of a domain as a domain-specific “preparatory state” (e.g., attention directed to the left, or being prepared to make an odd/even digit categorization). An entire “task set” is specified by a combination of preparatory states in each of several cognitive domains. In our paradigm (see Fig. 1b), the locus of attention and digit categorization rule are the two cognitive domains of interest, which together define four possible preparatory states (e.g., attend left and be prepared for the magnitude categorization, or Left-Mag). A voluntary transition between two task sets constitutes a “task switch” (e.g., from Left-Mag to Left-Par).
Figure 1. a, Behavioral task. In this example, subjects are instructed at the start of the run to attend to the left central RSVP stream and to prepare to perform the magnitude categorization. When a digit (e.g., “4”) is presented, they perform the high/low judgment on this digit. When the “R” cue is presented in the attended stream, they shift attention covertly to the right central RSVP stream and continue to prepare the magnitude categorization. When the “P” cue is presented in the attended stream, they switch to the parity categorization rule and wait for a digit to perform the odd/even judgment. Each RSVP frame lasts for 250 ms with no temporal gap between frames. Each critical event (cue or target) is separated by 3000–9000 ms. b, Four possible preparatory states subjects could occupy during the task; the double-headed arrows indicate the eight possible switches. c, Behavioral performance: response times and accuracy to targets (mean ± 1 SEM).
A major unresolved question concerns whether the control of task switches is domain-specific or domain-independent. According to a domain-specific account, each cognitive domain is controlled by its own dedicated mechanism for managing transitions between states (Rushworth et al., 2001).
In contrast, a domain-independent account holds that acts of control in any domain are implemented by a single mechanism that can be flexibly applied to initiate the required transition. Recent studies of attentional control have provided evidence for domain independence by revealing transient activity in medial superior parietal lobule (mSPL) that is time-locked to the initiation of shifts of attention within multiple perceptual domains, including space (Yantis et al., 2002), features (Liu et al., 2003), objects (Serences et al., 2004), and sensory modalities (Shomstein and Yantis, 2004).
A third, hybrid, possibility is that a common cortical locus is recruited for many domains, but distributed coding of control in that region differs from one domain to the next. Distinct but spatially interspersed populations of neurons might respond selectively during acts of control in some domains versus others.
No previous study has directly examined acts of cognitive control within two or more domains using a single paradigm. Here, we report data from a cued task-switching paradigm in which subjects either covertly shifted visuospatial attention between two locations or switched between two digit categorization rules. This paradigm permits a direct comparison of the pattern of brain activity associated with these two domains of cognitive control.
Materials and Methods
Subjects.
Sixteen right-handed neurologically healthy adults (seven females; 20–32 years of age) participated in the study. The study protocol was approved by the Johns Hopkins Medicine Institutional Review Board, and informed consent was obtained from all participants.
Stimuli and procedure.
Stimuli were rendered in white on a black background using MATLAB (The MathWorks) and Psychophysics Toolbox (Brainard, 1997) and were projected onto a screen mounted to the top of the magnet bore behind the subject's head; subjects viewed the screen reflected from a mirror at an optical distance of 68 cm. The subject held an MR-compatible response box with their left and right index and middle fingers each placed on one of four buttons. A custom-built MR-compatible infrared camera was used to monitor eye position during the task; the video signal was recorded with ViewPoint EyeTracker software (Arrington Research).
Subjects were instructed to maintain fixation at a central fixation point throughout each run. Two rapid serial visual presentation (RSVP) target streams of alphanumeric characters (250 ms per frame with no temporal gap) were located 4° of visual angle below the horizontal meridian and 4° to the left and right of the vertical meridian (see Fig. 1a). Each target stream was flanked by three distractor streams with an edge-to-edge separation of 0.5°. The distractor streams were included to compete perceptually with the target streams and maximize attention effects (Desimone and Duncan, 1995).
At the beginning of each run, an instruction screen specified the initial task set: an attended location and categorization rule (see Fig. 1a). At any given moment, the subject occupied one of the four preparatory states defined by their state of attention (i.e., covertly attended to either the left or the right RSVP target stream) and the categorization rule (i.e., parity: odd vs even; or magnitude: high, 6–9, vs low, 2–5) they were prepared to follow.
Forty-eight critical events were randomly intermixed among filler items (noncue letters) in each run; one-half of the critical events were letter cues and one-half were target digits (i.e., 2–9). The four letter cues (“L,” “R,” “M,” “P”) specified “attend to the target stream on the left,” “attend to the target stream on the right,” “prepare for magnitude categorization,” and “prepare for parity categorization,” respectively. When a cue or target appeared, subjects were required to (1) make a transition to another state (attention shift or rule switch), (2) remain in the current state (hold), or (3) perform a digit categorization. Whether a given letter cue required a shift or hold depended on both the currently occupied state and the identity of the cue; for example, if attention was directed to the left stream and the subject was prepared to make a parity judgment, then the cues “R” or “M” required a shift to a new state, whereas the cues “L” or “P” required that they maintain the current state.
Subjects were trained to respond to the cues by pressing buttons at the same time with both index fingers (i.e., the response was the same for all cues), and to respond to target digits with either the left or right middle finger, depending on their trained stimulus–response mappings (e.g., press the left button for “high” or “odd” and the right button for “low” or “even”). The stimulus–response mappings were counterbalanced across subjects.
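Because two categorization rules share two response buttons, some digits call for the same button under both rules and others do not (the congruency factor examined in the behavioral results). The following is a minimal sketch of that consequence for the example mapping given above; the function and variable names are hypothetical and are not taken from the study's experiment code.

```python
# Example mapping from the text: left button for "high" or "odd",
# right button for "low" or "even". Mappings were counterbalanced
# across subjects, so this is only one of the possible assignments.
EXAMPLE_MAPPING = {"high": "left", "odd": "left", "low": "right", "even": "right"}


def category(digit, rule):
    """Return the category of a target digit (2-9) under a rule."""
    if rule == "magnitude":
        return "high" if digit >= 6 else "low"   # high: 6-9, low: 2-5
    if rule == "parity":
        return "odd" if digit % 2 else "even"
    raise ValueError(f"unknown rule: {rule}")


def is_congruent(digit, mapping=EXAMPLE_MAPPING):
    """True if the digit maps to the same button under both rules."""
    return mapping[category(digit, "magnitude")] == mapping[category(digit, "parity")]


congruent = [d for d in range(2, 10) if is_congruent(d)]
print(congruent)  # [2, 4, 7, 9]
```

Under this mapping, half of the eight target digits are congruent and half are incongruent.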
Cue and target events were separated in time by 3–9 s with an average of 6 s between them. Cues and targets were presented in a random order. A given sequence could include two or more cues in succession or two or more targets in succession. The six distractor streams consisted only of noncue letters. Cues and targets did not appear in the ignored target stream.
Subjects practiced outside the scanner with auditory feedback (a tone after incorrect responses to target digits) until they achieved >90% accuracy. Performance feedback was not provided in the scanner, with one exception: if an attention shift cue was missed (indicated by no response within 1500 ms since cue onset), a correction cue (i.e., a square replacing a filler letter) was presented in the currently attended stream as an instruction to redirect the subject's attention to the correct stream.
Each run in the scanner lasted for 308 s. Subjects completed an average of 24 runs over two scanning sessions on 2 different days. Eye position was monitored during scanning to ensure fixation.
The 2 (domain of control: spatial attention vs categorization rule) × 2 (trial type: shift vs hold) design resulted in four cue conditions: attention shift (attSh), attention hold (attHd), categorization rule switch (rulSw), and categorization rule hold (rulHd). Because cue and target events were randomly intermixed in a run, each target either followed one of the four cues, or another same-rule target. This two-factor design (i.e., domain of control by trial type) permits a direct comparison of the cortical source of cognitive control in the two distinct domains of interest.
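The dependence of each cue's meaning on the currently occupied state can be made concrete with a short sketch. The cue letters and condition labels come from the text; the data structures and function name are hypothetical illustrations, not the authors' implementation.

```python
from itertools import product

# Each cue letter names a domain and the value that domain should take.
CUE_EFFECT = {
    "L": ("attention", "left"),
    "R": ("attention", "right"),
    "M": ("rule", "magnitude"),
    "P": ("rule", "parity"),
}


def apply_cue(state, cue):
    """Return (new_state, condition) for a cue given the current state.

    `state` is a dict with keys "attention" and "rule".
    """
    domain, value = CUE_EFFECT[cue]
    is_switch = state[domain] != value
    new_state = dict(state)
    new_state[domain] = value
    if domain == "attention":
        condition = "attSh" if is_switch else "attHd"
    else:
        condition = "rulSw" if is_switch else "rulHd"
    return new_state, condition


# Example from the text: attending left, prepared for a parity judgment.
state = {"attention": "left", "rule": "parity"}
for cue in "RMLP":
    print(cue, apply_cue(state, cue)[1])
# R attSh
# M rulSw
# L attHd
# P rulHd

# Enumerate every state-by-cue combination and count the switches.
all_states = [{"attention": a, "rule": r}
              for a, r in product(("left", "right"), ("magnitude", "parity"))]
n_switches = sum(apply_cue(s, c)[1] in ("attSh", "rulSw")
                 for s in all_states for c in CUE_EFFECT)
print(n_switches)  # 8
```

The count of eight directed switches agrees with the eight possible switches shown in Figure 1b, and no single cue ever changes both domains at once.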
Imaging acquisition and processing.
MRI scanning was performed with a Philips Intera 3T scanner in the F. M. Kirby Research Center for Functional Brain Imaging at the Kennedy Krieger Institute (Baltimore, MD). Anatomical images were acquired using an MP-RAGE T1-weighted sequence that yielded images with a 1 mm isotropic voxel resolution [repetition time (TR), 8.1 ms; echo time (TE), 3.7 ms; flip angle, 8°; time between inversions, 3 s; inversion time, 738 ms]. Whole-brain echoplanar functional images (EPIs) were acquired with an eight-channel SENSE (MRI Devices) parallel-imaging head coil in 40 transverse slices (TR, 2000 ms; TE, 35 ms; flip angle, 90°; matrix, 64 × 64; field of view, 192 mm; slice thickness, 3 mm, no gap), yielding 3 mm isotropic voxel resolution.
Neuroimaging data were analyzed using BrainVoyager QX software (Brain Innovation). Functional data were slice-time and motion corrected and then temporally high-pass filtered to remove components occurring three or fewer cycles over the course of a run. To correct for between-scan motion, each subject's EPI volumes were all coregistered to that subject's anatomical scan. Finally, the images were Talairach-transformed and resampled into 3 mm isotropic voxels. No spatial smoothing was performed.
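As a rough check on what the high-pass criterion implies, the run length (308 s) and functional TR (2000 ms) stated in the Methods can be turned into a volume count and a cutoff frequency. This is back-of-the-envelope arithmetic only; it assumes volumes were acquired continuously for the whole run and says nothing about how BrainVoyager implements the filter internally.

```python
RUN_DURATION_S = 308.0   # run length stated in the procedure
TR_S = 2.0               # functional repetition time (2000 ms)
MAX_CYCLES_REMOVED = 3   # "three or fewer cycles" per run are removed

# Assumption: one volume per TR for the full run, no discarded volumes.
volumes_per_run = int(RUN_DURATION_S / TR_S)
cutoff_hz = MAX_CYCLES_REMOVED / RUN_DURATION_S
cutoff_period_s = RUN_DURATION_S / MAX_CYCLES_REMOVED

print(volumes_per_run)            # 154
print(round(cutoff_hz, 4))        # 0.0097
print(round(cutoff_period_s, 1))  # 102.7
```

In other words, signal fluctuations slower than roughly one cycle per 103 s would be removed under these assumptions.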
The general linear model (GLM) approach (Friston et al., 1995) was used to estimate parameter values. Two GLMs separately modeled the sustained and transient effects of cognitive control (the same pattern of results was obtained when both effects are modeled simultaneously with one GLM; we report the results from modeling with two GLMs for clarity). Sustained effects were modeled by four regressors accounting for four possible preparatory states (see Fig. 1b): (1) attending to the left location and being prepared to carry out the magnitude categorization (Left-Mag) or (2) the parity categorization (Left-Par), (3) attending to the right location and being prepared to carry out the magnitude categorization (Right-Mag) or (4) the parity categorization (Right-Par). The second GLM included a total of 16 regressors to model transient effects. Eight of them modeled cue events, accounting for the four conditions as follows: (1–2) attention shift (attSh), including shifts from left to right (sLR) and vice versa (sRL); (3–4) attention hold (attHd), including hold left (hL) and hold right (hR); (5–6) categorization rule switch (rulSw), including from magnitude to parity (sMP), and vice versa (sPM); and (7–8) categorization rule hold (rulHd), including hold magnitude (hM) and hold parity (hP).
The remaining eight regressors modeled target events, which permitted us to test hypotheses concerning the role of rule reconfiguration in accounting for residual switch costs (for details, see Results). Two of these target-related regressors (9–10) modeled trials in which an incorrect response was made for each categorization rule; the rest modeled correct target trials according to which kind of critical event (cue or target) they followed in the trial sequence. Two regressors (11–12) modeled targets that appeared after each of the categorization rule switch cues (sMP and sPM). The rest of the regressors modeled categorization rule hold targets. There are two categories of rule hold targets, because cues and targets were randomly intermixed. The first category consists of targets that followed any cue that did not require a categorization rule switch (i.e., hM, hP, hR, hL, sLR, sRL); two regressors (13–14), one for each categorization rule the subject was currently prepared to follow, modeled these targets (that is, we did not separately model these different types of cues, because they were all equivalent in not requiring a categorization rule switch). The second category consists of targets that followed targets of the same categorization rule; two regressors (15–16), one for each rule, modeled these targets.
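The assignment of each target to one of the eight target regressors (9–16) follows a simple decision rule, sketched below. The regressor label strings and the function name are hypothetical; only the grouping logic is taken from the description above.

```python
RULE_SWITCH_CUES = ("sMP", "sPM")
NON_RULE_SWITCH_CUES = ("hM", "hP", "hR", "hL", "sLR", "sRL")


def target_regressor(rule, correct, previous_event):
    """Name the regressor that models one target event.

    rule           -- "magnitude" or "parity", the rule currently in force
    correct        -- whether the categorization response was correct
    previous_event -- cue label preceding the target, or "target"
    """
    if not correct:
        return f"error_{rule}"                    # regressors 9-10
    if previous_event in RULE_SWITCH_CUES:
        return f"after_{previous_event}"          # regressors 11-12
    if previous_event in NON_RULE_SWITCH_CUES:
        return f"after_nonswitch_cue_{rule}"      # regressors 13-14
    if previous_event == "target":
        return f"TT_{rule}"                       # regressors 15-16
    raise ValueError(f"unknown preceding event: {previous_event}")


print(target_regressor("parity", True, "sMP"))     # after_sMP
print(target_regressor("magnitude", True, "sLR"))  # after_nonswitch_cue_magnitude
print(target_regressor("parity", False, "hP"))     # error_parity
```

Checking errors first mirrors the text, in which incorrect trials are modeled separately and the remaining regressors cover correct trials only.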
The regressors were created by convolving a single-gamma hemodynamic response function (Boynton et al., 1996) with a boxcar function marking the temporal position of each event (250 ms) or each sustained preparatory state (variable durations). A group random effects analysis was performed, and for each contrast a minimum individual voxel threshold of p < 0.0089, uncorrected, was adopted. A minimum cluster of 7 contiguous voxels (189 mm³) was adopted to correct for multiple comparisons, yielding a whole-brain corrected statistical threshold of α < 0.01 determined by a cluster threshold estimator plug-in implemented in BrainVoyager.
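The construction of a single regressor can be illustrated with a pure-Python sketch: a boxcar marking event onsets is convolved with a single-gamma response function and then sampled once per TR. The gamma parameters (`n`, `tau`, `delay`), the 0.25 s sampling grid, and all function names are illustrative assumptions; the parameter values actually used in BrainVoyager for this study are not stated in the text.

```python
import math


def gamma_hrf(t, n=3, tau=1.25, delay=2.5):
    """Single-gamma response at time t (s). Parameter values are assumed."""
    s = t - delay
    if s < 0:
        return 0.0
    return (s / tau) ** (n - 1) * math.exp(-s / tau) / (tau * math.factorial(n - 1))


def make_regressor(onsets, durations, run_length=308.0, tr=2.0, dt=0.25):
    """Convolve a boxcar with the HRF on a fine grid, then sample every TR."""
    n_fine = int(round(run_length / dt))
    boxcar = [0.0] * n_fine
    for onset, duration in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = max(start + 1, int(round((onset + duration) / dt)))
        for i in range(start, min(stop, n_fine)):
            boxcar[i] = 1.0

    kernel = [gamma_hrf(k * dt) for k in range(int(round(30.0 / dt)))]
    convolved = [0.0] * n_fine
    for i, b in enumerate(boxcar):
        if b:
            for k, h in enumerate(kernel):
                if i + k < n_fine:
                    convolved[i + k] += b * h * dt

    step = int(round(tr / dt))
    return convolved[::step]   # one value per acquired volume


# A single 250 ms event at 10 s, and a 12 s sustained state starting at 100 s.
transient = make_regressor([10.0], [0.25])
sustained = make_regressor([100.0], [12.0])
print(len(transient))  # 154
```

The same routine covers both GLMs described above: brief boxcars give the transient regressors, and variable-length boxcars give the sustained ones. The voxel arithmetic behind the cluster threshold is simpler: 7 voxels × (3 mm)³ = 189 mm³.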
Event-related average time courses of the blood oxygen level-dependent (BOLD) signal were computed across subjects for all significantly activated voxels within a region of interest, time locked to the cue or target event in question, plotted as percentage signal change relative to the mean of the run. Error bars indicate ± 1 SEM across subjects.
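The event-related averaging described above amounts to expressing a time series as percentage change from the run mean and averaging fixed windows around each event. A minimal sketch follows; the function names, the window convention (in volumes), and the toy time series are hypothetical and serve only to illustrate the computation.

```python
def percent_signal_change(ts):
    """Express a time series as percentage change from its run mean."""
    mean = sum(ts) / len(ts)
    return [100.0 * (v - mean) / mean for v in ts]


def event_related_average(ts, onset_volumes, window=(-2, 8)):
    """Average windows of percent signal change around event onsets.

    window is (first, last) offset in volumes relative to the event volume.
    Events whose window runs off either end of the run are skipped.
    """
    psc = percent_signal_change(ts)
    first, last = window
    epochs = [psc[o + first:o + last + 1]
              for o in onset_volumes
              if o + first >= 0 and o + last < len(psc)]
    if not epochs:
        raise ValueError("no event has a complete window inside the run")
    return [sum(e[i] for e in epochs) / len(epochs)
            for i in range(last - first + 1)]


# Toy time series (arbitrary units) with a response at volumes 5 and 15.
toy = [100.0] * 20
toy[5] = toy[15] = 110.0
print([round(v, 2) for v in event_related_average(toy, [5, 15], window=(-1, 1))])
# [-0.99, 8.91, -0.99]
```

Computing this per subject and then taking the mean and SEM across subjects would yield time courses with error bars of the kind shown in the figures.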
Results
Behavioral results
The overall accuracy across subjects, including both cues and targets, was 85% during scanner sessions. Cue and target detection accuracy and response times (RTs) are shown in Table 1 (for target detection performance, see also Fig. 1c); target responses are shown separately for the cue type preceding the target in question (TT refers to a target after another target). Data from the four conditions (i.e., attSh, attHd, rulSw, rulHd) were entered into a 2 (domain of control: spatial attention or categorization rule) × 2 (trial type: shift or hold) ANOVA separately for cues and targets.
Table 1. Behavioral performance in response to each cue type and to targets after each cue type
The ANOVA for responses to cues revealed a significant main effect of trial type on RT (F(1,15) = 21.05; p < 0.001): RTs to shift cues were slower than to hold cues (1049 vs 1023 ms). Because the cues did not differ in their appearance (i.e., all were letters) or motor requirements (i.e., all required the subject to press both buttons), the slowed responses to shift cues probably reflect the time required to retrieve from working memory the mental act (a shift of attention or a switch of categorization rule) that now had to be carried out. There was no main effect of domain of control (attention vs rule), no interaction on RT, and no significant main effects or interaction on accuracy (p > 0.3).
Responses to targets that followed attention cues were less accurate than those after rule cues (80 vs 84%; F(1,15) = 22.49, p < 0.001). This was not attributed to a speed–accuracy tradeoff because response times to targets after attention cues were also slower (1173 vs 1134 ms; F(1,15) = 7.42, p < 0.02). The targets were either mapped to the same response in both the magnitude and parity rules or not; for example, if low and even were mapped to a left button press, 2, 4, 7, and 9 were mapped to congruent responses and the other digits to incongruent responses. There was a main effect of congruency in RTs but this factor did not interact with any of the other factors (p > 0.05), and therefore is not considered further. The main effect of trial type on target RTs was also significant (F(1,15) = 5.23; p < 0.05); responses to targets after attSh and rulSw cues were slower than to those after attHd and rulHd cues. No other main effects or interactions were significant (all values of p > 0.1). As shown in the next paragraph, this main effect was driven by the rule switch but not by the attention shift.
Task-switching costs were evaluated by comparing responses to targets after switch cues with those after hold cues. Under our conceptualization of task set, there are two such comparisons: targets after attention shift cues versus targets after attention hold cues, and targets after rule switch cues versus targets after rule hold cues. Only the second kind of switch cost has been reported in the literature. Note that because targets followed cues by at least 3000 ms, any observed costs are instances of “residual switch costs” (Meiran et al., 2000). A t test revealed a significant residual task switch cost of 41 ms (t(15) = 2.67; p < 0.02) in our paradigm for targets after rule switch cues relative to targets after rule hold cues. Targets after attention shift cues and targets after attention hold cues are both instances of rule hold trials (because switch cues required either a shift of attention or a switch in categorization rule but never both). There was no significant difference in response times to targets after attention shift and hold cues (t(15) = 1.22; p > 0.24).
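A switch cost of this kind is a within-subject difference, so the natural test is a paired t test on per-subject mean RTs. The sketch below shows the computation in plain Python; the function name is hypothetical, the text does not state which software performed the test, and no subject-level data from the study are reproduced here.

```python
import math


def paired_switch_cost(rt_switch, rt_hold):
    """Return (mean cost, t statistic, degrees of freedom).

    rt_switch, rt_hold -- per-subject mean RTs (ms) to targets after
    switch cues and after hold cues, in the same subject order.
    """
    if len(rt_switch) != len(rt_hold) or len(rt_switch) < 2:
        raise ValueError("need two equal-length lists with at least 2 subjects")
    diffs = [s - h for s, h in zip(rt_switch, rt_hold)]
    n = len(diffs)
    mean_cost = sum(diffs) / n
    variance = sum((d - mean_cost) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_cost / math.sqrt(variance / n)
    return mean_cost, t_stat, n - 1
```

With 16 subjects the function returns 15 degrees of freedom, matching the t(15) values reported above.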
Neuroimaging results
Sustained effects of cognitive control
Cognitive control of attentional and categorization task states was first assessed with separate contrasts. In these contrasts, we treated the runs as consisting of a sequence of variable-length epochs (ranging in duration from 3 to as long as 18 s) during which the subject occupied one of four sustained preparatory states (Fig. 1b): attend left while preparing to carry out the magnitude digit categorization (Left-Mag), attend left while preparing the parity categorization (Left-Par), and likewise for attend right (Right-Mag, Right-Par). A contrast of the locus of attention (i.e., Left-Mag and Left-Par > Right-Mag and Right-Par) revealed increased activation in extrastriate cortex and posterior intraparietal sulcus (IPS) contralateral to the locus of spatial attention (Table 2). The averaged event-related time courses in those regions show that, before the shift in attention, there was greater activity for sustained attention to the contralateral side of space than to the ipsilateral side (Fig. 2). After the attSh (i.e., sRL, sLR) cue, an increase in signal occurred for shifts from the ipsilateral to the contralateral side, and a decrease for shifts from the contralateral to the ipsilateral side. The hold cues were associated with sustained activity that changed little before and after the cues. This pattern confirms that subjects were covertly maintaining and shifting spatial attention from one side of the display to the other according to the attention cues, as shown in many previous studies (Yantis et al., 2002).
Figure 2. Sustained effects of spatial attention (attend left vs attend right) in extrastriate cortex. The event-related average time courses were computed as percentage signal change relative to the mean BOLD signal across the entire run. Time 0 is when the cue event in question occurred. hL and hR cues produced sustained contralateral activity in each area that did not change when the hold cue appeared; sLR and sRL transitioned from relatively low to sustained high levels of activity (or vice versa) after shift cues. Error bands show ±1 SEM.
Table 2. Brain regions exhibiting sustained effects of attention
The contrast between categorization rules (i.e., Left-Mag and Right-Mag > Left-Par and Right-Par) did not reveal any significant activation. This suggests that no specific brain region was associated with maintaining either categorization task state exclusively.
Cue-evoked transient effects of cognitive control
To examine the transient effects of voluntary control, we analyzed event-related activity time-locked to the instructional cues. Linear contrasts were performed to identify regions of interest (ROIs) for the two domains of control (Table 3). First, we identified the brain regions involved in each domain of control with two separate contrasts (i.e., attSh > attHd and rulSw > rulHd for attention and rule, respectively). Second, we performed a conjunction analysis of these two contrasts. Finally, we extracted β weights for cue-related regressors (i.e., sLR, sRL, sMP, sPM, and the respective holds) from regions identified by the above contrasts, and tested for domain independence using post hoc t tests. A truly domain-independent region should exhibit transient responses during shift cues but little or no transient response during hold cues, regardless of the cognitive domain and of the specific direction of the shift. In other words, we tested for significant effects within all four pairs of shifts versus their respective holds (i.e., sLR > hL, sRL > hR, sPM > hP, and sMP > hM) in each ROI.
Table 3. Brain regions exhibiting transient effects of cognitive control
Figure 3 shows the results of a contrast between attSh (i.e., sRL and sLR) and attHd (i.e., hR and hL) cues, which revealed a network of attentional control regions that included mSPL, right superior precentral sulcus (sPCS), right middle frontal gyrus (MFG), right inferior frontal gyrus (IFG), and right supramarginal gyrus (SMG) (p < 0.001, corrected). Event-related average time courses are shown for each ROI.
Figure 3. Sources of cognitive control for shifting spatial attention (attSh > attHd). The event-related average time courses are shown for each instance of attention shifts and holds.
A contrast between rulSw (i.e., sPM and sMP) and rulHd (i.e., hP and hM) cues revealed significant activation only in mSPL and left IPS for control of the categorization rule (Fig. 4). Event-related average time courses are shown for each ROI. Note that the spatial extent of mSPL activated by attSh versus attHd was greater than that for rulSw versus rulHd (compare Figs. 3, 4). The rulSw versus rulHd contrast evoked activity in a region that was mostly included within the attSh versus attHd contrast.
Figure 4. Sources of cognitive control for switching between two categorization rules (rulSw > rulHd). The event-related average time courses are shown for each instance of rule switches and holds.
We performed a conjunction analysis (Nichols et al., 2005), which included attSh > attHd and rulSw > rulHd (p < 0.001, cluster threshold corrected). The analysis revealed a region in mSPL that was activated by shifts of both spatial attention and categorization rule (Fig. 5).
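The logic of a minimum-statistic conjunction in the sense of Nichols et al. (2005) can be sketched in a few lines: a voxel survives only if the smaller of its two contrast statistics exceeds the threshold, which is equivalent to requiring that both contrasts be individually significant. The function name, the flat-list map representation, and the threshold and t values in the example are hypothetical; the cluster-size correction used in the study is omitted here.

```python
def conjunction_map(t_map_a, t_map_b, t_crit):
    """Minimum-statistic conjunction of two voxelwise t maps.

    Each map is a flat list of t values in the same voxel order.
    Voxels whose minimum t does not exceed t_crit are set to 0.0.
    """
    if len(t_map_a) != len(t_map_b):
        raise ValueError("maps must cover the same voxels")
    result = []
    for a, b in zip(t_map_a, t_map_b):
        smallest = min(a, b)
        result.append(smallest if smallest > t_crit else 0.0)
    return result


# Made-up t values for five voxels, purely for illustration.
att_shift_vs_hold = [5.1, 4.2, 0.3, 6.0, 2.0]
rule_switch_vs_hold = [4.4, 1.0, 4.9, 3.9, 2.1]
print(conjunction_map(att_shift_vs_hold, rule_switch_vs_hold, t_crit=3.0))
# [4.4, 0.0, 0.0, 3.9, 0.0]
```

Only the first and fourth voxels exceed the threshold in both contrasts, so only they survive the conjunction.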
Figure 5. Common sources of control for shifting spatial attention and switching categorization rule. The conjunction analysis revealed the overlapping activation in mSPL for attSh > attHd and rulSw > rulHd.
To investigate the degree to which the shift-evoked transient signal was truly domain independent, we tested whether the average β weights for each instance of shift were greater than those for the respective hold (i.e., sLR > hL, sRL > hR, sPM > hP, and sMP > hM) in each ROI shown in Figures 3 and 4.
Within the ROIs identified by attSh versus attHd (i.e., right sPCS, right IFG, right MFG, right SMG, mSPL) (Fig. 3), the attention shift-related contrasts were significant (sLR > hL and sRL > hR; all values of p < 0.001). This is consistent with the fact that these ROIs were extracted from the attSh > attHd contrast and confirms that the attSh > attHd effect was not driven by just one of the two attention shift directions. Within the region of mSPL activated by attention shifts, the two rule switch contrasts were significant or nearly so (sPM > hP, p < 0.09; sMP > hM, p < 0.01). Note that the spatial extent of the significantly activated region of mSPL for rule switches is smaller than that for attention shifts and that, as revealed by the conjunction analysis (Fig. 5), there exists a subregion of mSPL that is significantly activated by both domains of control. None of the remaining ROIs exhibited significant activation for all four shift contrasts.
Within the ROIs identified by rulSw versus rulHd (i.e., left IPS and mSPL) (Fig. 4), the rule switch-related contrasts were all significant (sPM > hP, values of p < 0.01; sMP > hM, values of p < 0.005). Again, this confirmed that the rulSw > rulHd contrast was not driven by a particular instance of rule switches. Only mSPL exhibited significant effects for all four shift contrasts (values of p < 0.02).
fMR-adaptation analysis
The results reported here do not rule out the possibility that there exist domain-specific subpopulations of neurons within mSPL (either clustered into subregions or spatially intermingled) that were not dissociable with the conventional general linear model analysis that we applied. To investigate this possibility, we examined the data for patterns of fMR-adaptation, which is often used to infer coding of subvoxel functional units within a region (Ewbank et al., 2005; Rotshtein et al., 2005), based on the finding that repeated stimuli evoke a weaker BOLD response than nonrepeated stimuli (Grill-Spector et al., 2006). We analyzed the magnitude of the BOLD signal evoked by two-trial sequences of switch cues in which the cues required switches within the same domain (attention shift followed by attention shift or rule switch followed by rule switch) or across domains (attention shift–rule switch or vice versa). The specific instances of pairs (e.g., rule switch–rule switch, attention shift–attention shift, rule switch–attention shift, and attention shift–rule switch) were grouped into within- and across-domain pairs to maximize the sensitivity of this analysis.
If mSPL truly is domain independent, then there should be no difference in the magnitude of the BOLD signal evoked by the within- and across-domain switch pairs, because the very same neurons are active for both switches. Alternatively, if mSPL contains domain-specific subpopulations of neurons, then within-domain switches should produce less activity compared with between-domain switches (because of adaptation). The analysis revealed that the mean peak response evoked by the second cue was significantly smaller for within-domain than for across-domain pairs (t(15) = 2.63; p < 0.02) (Fig. 6). This result provides tentative evidence of greater within-domain adaptation and suggests that separate, domain-specific subpopulations of neurons may exist within mSPL.
Figure 6. Event-related time course in mSPL evoked by the second switch cue in a pair as a function of whether the switches were in the same or different domains. The first switch cue occurred <12 s before the onset of the second switch cue.
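The grouping of switch-cue pairs into within-domain and across-domain sequences can be sketched as follows. Two points are assumptions made for illustration rather than details given in the text: that a pair consists of two consecutive critical events that are both switch cues, and that the 12 s limit from the caption above is applied as a strict upper bound on the onset difference. All names are hypothetical.

```python
SWITCH_DOMAIN = {"attSh": "attention", "rulSw": "rule"}


def classify_switch_pairs(events, max_gap_s=12.0):
    """Label the second cue of each switch-switch pair.

    events -- list of (onset_s, condition) tuples in temporal order, where
              condition is "attSh", "attHd", "rulSw", "rulHd", or "target".
    Returns a list of (onset of second cue, "within" or "across").
    """
    pairs = []
    for (t1, c1), (t2, c2) in zip(events, events[1:]):
        both_switches = c1 in SWITCH_DOMAIN and c2 in SWITCH_DOMAIN
        if both_switches and (t2 - t1) < max_gap_s:
            same_domain = SWITCH_DOMAIN[c1] == SWITCH_DOMAIN[c2]
            pairs.append((t2, "within" if same_domain else "across"))
    return pairs


# Made-up event sequence for illustration.
sequence = [(10.0, "attSh"), (16.0, "attSh"), (22.0, "rulSw"),
            (30.0, "target"), (35.0, "rulSw"), (50.0, "rulSw")]
print(classify_switch_pairs(sequence))
# [(16.0, 'within'), (22.0, 'across')]
```

The final two rule switches are not paired because their onsets are 15 s apart, and the target at 30 s breaks the sequence before them.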
Relationship between cortical activity and behavior
We next examined the degree to which the magnitude of the reconfiguration signal (a possible index of task set reconfiguration during rule switches, reflected in our behavioral finding of rule switch costs) is related to behavioral performance (specifically, target RT) within trial couplets containing a rule switch cue followed by a target. The idea was to determine whether a cue that evoked a relatively large transient BOLD signal in mSPL on trials requiring a shift in categorization rule (e.g., from the odd/even to the high/low rule) led to more effective states of preparation and therefore better performance in the digit categorization task compared with trials in which the switch cue evoked a relatively small signal in mSPL.
The trials were grouped into quartiles according to target response time (i.e., fastest 25% of RTs, and so forth). The mean cue-evoked BOLD activity was computed for trials within each quartile in the rule-switching control areas, identified by rule switches versus holds (left IPS and mSPL) (Fig. 4). Figure 7 shows the event-related time course associated with the fastest and the slowest RT quartiles, respectively. A 2 (RT bin: fastest and slowest) × 6 (time point: 0–6 TRs, 0–12 s) ANOVA was performed in each region. The interaction between RT bin and time point was significant in both regions (F(5,15) = 3.55, p = 0.002 for IPS; F(5,15) = 2.65, p = 0.015 for mSPL). A post hoc paired t test revealed that rulSw cues in the fastest RT bin evoked a greater BOLD response than those in the slowest RT bin at 6–8 s after the cue onset (mSPL: 8 s, t(15) = 2.9, p = 0.011; left IPS: 6 s, t(15) = 2.77, p = 0.014; 8 s, t = 2.06, p = 0.057, one-tailed test). Figure 7 also shows the combined time course of the two intermediate RT quartiles; the peak of this response falls between that of the fastest and the slowest quartiles. Furthermore, the same analysis performed on regions not specific to rule switches (e.g., right sPCS, right SMG, right IFG, and right MFG) revealed no significant effects (all values of p > 0.5). This analysis suggests that the degree to which a task switch is successful (as measured by RT to the following target) is related to the magnitude of the transient signal in mSPL and in left IPS.
Figure 7. Event-related time course evoked by rule switch cues for trials in which the subsequent target response time fell within the fastest or slowest RT quartiles and the average of two middle quartiles. a, mSPL; b, left IPS.
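The quartile grouping used in this analysis can be sketched as a rank-based binning of trials by target RT, followed by averaging a cue-evoked measure within each bin. The function names, the tie-handling (ties are broken by trial order), and the example values are hypothetical; the text does not specify how bin boundaries were handled when the trial count was not divisible by four.

```python
def rt_quartile_bins(rts):
    """Return a quartile index per trial: 0 = fastest 25%, 3 = slowest 25%."""
    order = sorted(range(len(rts)), key=lambda i: rts[i])
    bins = [0] * len(rts)
    for rank, trial in enumerate(order):
        bins[trial] = rank * 4 // len(rts)
    return bins


def mean_by_bin(values, bins, n_bins=4):
    """Average a per-trial measure (e.g., peak cue-evoked signal) per bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for v, b in zip(values, bins):
        sums[b] += v
        counts[b] += 1
    return [s / c if c else float("nan") for s, c in zip(sums, counts)]


# Made-up target RTs (ms) for eight rule-switch trials.
rts = [900, 1200, 1000, 1100, 950, 1300, 1050, 1250]
print(rt_quartile_bins(rts))  # [0, 2, 1, 2, 0, 3, 1, 3]
```

Bins 1 and 2 can then be pooled to form the combined intermediate time course described above.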
Target-evoked transient effects of cognitive control
According to the two-component task set reconfiguration hypothesis (Rogers and Monsell, 1995), voluntary cognitive control is triggered by the advance task cue but is typically incomplete. This leads to a residual switch cost even with a relatively long preparation interval [e.g., over 3 s in the study by Meiran et al. (2000)]. To fully prepare for the new task that is associated with a different rule, it is necessary to actually carry out the task at least once; therefore, task set reconfiguration cannot be fully completed until the target appears after a switch cue and a response is made to it.
To examine this hypothesis, we contrasted the β weights associated with the appearance of the first target after a rule switch cue (which we designate TrulSw) with those for targets that followed other same-task targets (TT). Targets after other cues (i.e., after attSh, attHd, rulHd cues) were also instances of task repetitions; however, the significantly faster RTs associated with TT trials compared with TrulSw (993 vs 1154 ms; p < 0.05) and as well as TrulHd trials (993 vs 1113 ms; p < 0.05) suggest that TT trials are instances of a fully prepared state. It is important to emphasize that this contrast focuses on target-evoked activity, unlike the previous contrasts, all of which involved cue-evoked activity.
This contrast revealed increased activity only in presupplementary motor area (pre-SMA) (Table 3). β weights for all target regressors in pre-SMA were extracted and entered into a 2 (categorization rule: magnitude vs parity) by 3 (trial type: TrulSw, TrulHd, TT) ANOVA. This analysis revealed a significant main effect of trial type (F(2,15) = 12.28; p < 0.001). The main effect of rule and the interaction were not significant (p > 0.5). The two categorization tasks were collapsed for calculating event-related time courses for the three trial types. Figure 8 shows a transient increase in activity evoked in this region by targets after a rule switch cue (TrulSw), but not by targets after a hold cue or after another target (TrulHd and TT, respectively). Post hoc tests revealed no difference in average peak response (4–6 s) between TrulHd and TT (t(15) = 1.54; p > 0.05) but significantly greater peak responses for TrulSw versus TrulHd (t(15) = 2.12; p = 0.05) and versus TT (t(15) = 3.05; p < 0.01). This transient signal may reflect the target-evoked “final reconfiguration” required after a categorization rule switch and is consistent with the two-component account of task switching (Rogers and Monsell, 1995).
Execution-dependent cognitive control for rule switching. Mean event-related time course evoked by targets that followed a rule-switch cue (TrulSw), a rule-hold cue (mean of TrulHd), or a same-rule target (TT) in pre-SMA.
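For readers less familiar with the post hoc comparisons of peak responses, the following is a minimal sketch of a paired t statistic computed across subjects. The function name and input layout are assumptions for illustration, and the study's own statistics software is not specified here. A paired test on n subjects has n − 1 degrees of freedom, so a reported t(15) corresponds to 16 paired observations.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(cond_a, cond_b):
    """Return (t, df) for a paired t test on two equal-length lists.

    cond_a, cond_b: one value per subject for each condition, e.g. the
    average peak response for two trial types (hypothetical inputs).
    """
    if len(cond_a) != len(cond_b) or len(cond_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]  # within-subject differences
    std_err = stdev(diffs) / sqrt(len(diffs))  # standard error of the mean difference
    if std_err == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean(diffs) / std_err, len(diffs) - 1
```

The returned t value would then be compared against the t distribution with the returned degrees of freedom to obtain a p value.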
Discussion
The present study reveals a common neural substrate in mSPL for the control of shifting spatial attention and switching between categorization rules. Several previous studies of cognitive control in a variety of domains have reported activity in posterior parietal cortex (Dove et al., 2000; Kimberg et al., 2000; Sohn et al., 2000; Vandenberghe et al., 2001; Brass and von Cramon, 2002; Dreher et al., 2002; Yantis et al., 2002; Braver et al., 2003; Liu et al., 2003; Serences et al., 2004; Shomstein and Yantis, 2004, 2006; Kelley et al., 2008). Here, we show for the first time, within a single paradigm, that these two domains of cognitive control recruit the same cortical region, implicating a domain-independent reconfiguration signal that initiates both perceptual attention shifts and categorization rule switches.
The fMR-adaptation analysis revealed greater adaptation within than across domains, suggesting that there may exist within mSPL domain-specific subpopulations of neurons. Nevertheless, the common cortical locus for these two domains of control, medial SPL, is consistent with the similar functions they subserve. Shifting and maintaining visuospatial attention requires the visual system to actively select a preferred sensory representation from among competitors. Similarly, maintaining and switching between categorization rules requires active selection of one rule from among competitors. The control of selection in both of these domains can be conceptualized according to the biased-competition model of attention (Desimone and Duncan, 1995) in which competition for perceptual representation is biased by a top–down signal driven by current task goals.
The larger magnitude of the transient response in mSPL after attSh than rulSw cues is consistent with the fact that attention shifts and rule switches drive different patterns of activity, as shown by the fMR-adaptation result. Several possible accounts of this magnitude difference can be considered. The cortical reconfiguration required to shift attention from one location to another may be much more extensive (i.e., modulation of large swaths of extrastriate cortex) than that required to switch between two different categorization rules. Another possibility is suggested by the failure-to-engage hypothesis of task switching (De Jong, 2000), which suggests that subjects might not always exert control even when appropriate. If, on some proportion of trials, subjects fail to switch rules immediately after the cue (and instead wait until the target appears), but always shift attention immediately after the cue, then the resulting magnitude of the mSPL response would be smaller for rule switches than for attention shifts. Assessment of these possibilities will require additional investigation.
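The failure-to-engage account can be made concrete with a simple worked expression. This is an illustrative sketch under assumptions not made in the text: cue-locked reconfiguration evokes the same response amplitude $A$ in mSPL in both domains whenever it occurs, trials without cue-locked reconfiguration contribute no switch-specific response, and subjects engage on a proportion $q$ of rule-switch trials but on every attention-shift trial.

```latex
\begin{align}
\bar{R}_{\mathrm{attSh}} &= A, \\
\bar{R}_{\mathrm{rulSw}} &= qA + (1 - q)\cdot 0 = qA, \qquad 0 \le q \le 1 .
\end{align}
```

Under these assumptions the trial-averaged response to rule-switch cues is smaller than the response to attention-shift cues whenever $q < 1$, even though the response on each engaged trial is identical.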
Each of the two domains of cognitive control we examined also evoked domain-specific activity. For example, covert shifts of attention evoked domain-specific transient activity in sPCS. The involvement of sPCS in covert shifts of spatial attention has been observed in several studies (Serences and Yantis, 2007; Ikkai and Curtis, 2008; Kelley et al., 2008) (for review, see Thompson et al., 2005). This area has been implicated as the human homolog of the macaque frontal eye field (FEF) (Paus, 1996; Rosano et al., 2002; Koyama et al., 2004). Single-cell recording in monkeys has characterized a subpopulation of neurons in FEF subserving visual selection (Schall, 2004; Thompson et al., 2005). In contrast, during categorization rule switching (but not attention shifting), we observed transient activity in left IPS, echoing other task-switching studies that have used nonperceptual tasks [Kimberg et al. (2000), left superior parietal lobule; Sohn et al. (2000), left posterior parietal].
The limited temporal resolution of fMRI precludes strong inferences about the temporal order of domain-independent and domain-specific control signals. If there exist domain-specific subpopulations of neurons within mSPL (as suggested by the fMR-adaptation results), it is possible that signals in mSPL activate domain-specific control regions for switching to the subsequent task state. Alternatively, signals in mSPL might be subject to competition between task states arising from a variety of biasing influences originating in prefrontal regions or basal ganglia (Cisek, 2007). Detailed investigation of these possibilities will require methods with improved temporal and spatial resolutions in future studies.
We did not observe transient effects of control for switching categorization rules in the prefrontal cortex (PFC), which has been hypothesized to be involved in abstract, higher-order control (Badre, 2008). This absence of activation does not, of course, imply that PFC is not involved in task control. In our paradigm, the key contrast was switch versus hold. It could well be the case that regions of PFC identified in previous studies are active during cue interpretation both for the maintenance of a task set and for reconfiguration. We can conclude only that PFC does not exhibit robust switch-specific transient signals. Many other aspects of cognitive control implemented in PFC (e.g., conflict monitoring, cue interpretation, maintaining alertness and an overall task set, and so forth) are evidently fairly consistently active during both switch and hold events in our paradigm.
Previous studies have reported sustained responses in the anterior prefrontal cortex when subjects were switching between two tasks (mixed-task blocks) relative to when they were performing single-task blocks (Braver et al., 2003). Similarly, Cole and Schneider (2007) devised a target-switching task in a visual search paradigm and identified a network of cortical regions for cognitive control, including PFC, pre-SMA, anterior cingulate cortex, posterior parietal cortex, and dorsal premotor cortex. They found sustained activation during extended intervals of target switching but not during nonswitch intervals. Both of those studies compared activity during a sustained epoch in which subjects were constantly switching between two tasks or holding a single task; the contrast between them may have in part reflected different working memory requirements in the two cases. In our study, subjects experienced only mixed-task runs. Furthermore, we used a rapid event-related design to isolate the transient activity during a single trial rather than in a block. The absence of sustained PFC activation differences for the different categorization rules in our study implies that the working memory demand within our mixed-task runs was similar for the magnitude and parity rules.
We also explored a proposed execution-dependent component of cognitive control [an “exogenous” form of control (Rogers and Monsell, 1995)]. This hypothesis was motivated by the behavioral observation that switch costs (i.e., RT differences between switch and repeat trials) persist even with long preparation intervals and high motivation (Nieuwenhuis and Monsell, 2002). If a distinct mechanism is required to fully implement a task set once the target is presented, then the first target after a switch cue should engage this mechanism, but other repetition targets should not. Our analysis identified pre-SMA with this execution-dependent reconfiguration function (Fig. 8), consistent with previous findings (Jäncke et al., 2000; Crone et al., 2006; Slagter et al., 2006); its function is thought to involve the instantiation of the correct stimulus–response mapping for the new task.
Finally, we observed a correlation between the magnitude of the switch-related cortical response and behavioral performance. Fast RTs in the categorization task were associated with a comparatively larger transient response to switch cues in mSPL and in left IPS. This suggests that the magnitude of the transient reconfiguration signals in those control regions may reflect the effectiveness of endogenous preparation for the upcoming new task.
We conclude that mSPL is the neural substrate for a domain-independent transient reconfiguration signal that initiates shifts in attention and in categorization rules. The presence of domain-specific adaptation effects suggests that mSPL may contain subpopulations of neurons whose activity differentially encodes the cognitive reconfiguration that is required. This pattern of results supports a framework for cognitive control in which preparatory cues initiate the reconfiguration of task set via the activation of mSPL, which may participate in canceling the previous task set, selecting the new one, or both.
Footnotes
- This work was supported by National Institutes of Health Grant R01-DA13165 (S.Y.). We thank A. Greenberg, B. J. Rosenau, and M. Esterman for comments and advice.
- Correspondence should be addressed to Yu-Chin Chiu, Department of Psychological and Brain Sciences, The Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218. yuchin{at}jhu.edu