Abstract
During visual search, the working memory (WM) representation of the search target guides attention to matching items in the visual scene. However, we can hold multiple items in WM. Do all these items guide attention at the same time? Using a new functional magnetic resonance imaging visual search paradigm, we found that items in WM can attain two different states that influence activity in extrastriate visual cortex in opposite directions: whereas the target item in WM enhanced processing of matching visual input, other “accessory” items in memory suppressed activity. These results imply that task-relevant and (currently) task-irrelevant items are represented differently in WM, revealing new insights into the organization of human visual WM. The suppressive influence of irrelevant WM items may complement the attention-guiding influence of task-relevant WM items, helping us to focus on task-relevant information without getting distracted by irrelevant memory content.
Introduction
When you search for an item, you maintain a representation of it in working memory (WM), which acts as a “search template” that guides attention and to which objects in the visual scene are matched (Duncan and Humphreys, 1989; Desimone and Duncan, 1995). We can store up to four items in WM (Cowan, 2001). Next to the target for current search, we can also store items in WM that only become relevant later. If all WM items have a similar status, attention would not only be guided to objects matching the search template, but also to objects matching task-irrelevant information in WM. Attentional guidance by these “accessory” memory items (MIs) would hamper task performance, as attention would be misdirected to distractors. A number of psychophysical studies (Houtkamp and Roelfsema, 2006; Olivers, 2009) showed that attentional guidance by task-relevant items is much stronger than that of accessory MIs, suggesting that they have a privileged status in WM (Olivers et al., 2011 for review). Such differential activation states of items in WM (Cowan, 1988; Oberauer, 2002, 2003) have also been observed in various WM paradigms (Lepsien and Nobre, 2007; Soto et al., 2007; Nee and Jonides, 2008; Kuo et al., 2009; Lewis-Peacock and Postle, 2012). In search tasks, the template could be in a more activated state so that it can exert a top-down, attentional influence on the visual representations, whereas the accessory items could be maintained in a dormant state that causes relatively little interference.
The neural underpinnings of these different WM states and how they influence visual processing are not well understood. The search template appears to bias competition between visual inputs toward matching items (Desimone and Duncan, 1995), as suggested by enhanced baseline activity of visual neurons tuned to process the target object (Chelazzi et al., 1993, 1998). This bias may cause a competitive advantage of these neurons over neurons processing different objects, resulting in enhanced neuronal responses to targets compared with responses evoked by distractors (Miller and Desimone, 1994). But little is known about the representation of the accessory MI. How is its interference during search tasks prevented? In principle, interference might be avoided by storing the accessory MIs in a more structural form that does not rely on persistent neuronal activity (resembling long-term memory; Mongillo et al., 2008), although there is also evidence that they are stored as persistent activity of different neurons or in different brain areas (for review, see Olivers et al., 2011).
In the present functional magnetic resonance imaging (fMRI) study, we compared the representation of the search template and accessory MIs using a “memory-loaded temporal search task” (Peters et al., 2009). Participants searched for a house or face stimulus (search target; ST), while maintaining an additional house or face (memory item; MI) in WM for a second search task (Fig. 1A). We found that task-relevant memory representations activate object-selective visual areas, whereas accessory WM items suppress activity in these regions.
Materials and Methods
Materials
Eighteen grayscale photographs (4.5 × 4.5°) of houses and 18 grayscale front-view photographs of Caucasian faces (half males) were used as stimuli (Kanwisher et al., 1997). The photographs were equal in luminance and unfamiliar to the participant. To create stimuli for the search stream, each individual face photograph was semitransparently superimposed on each individual house photograph, resulting in 324 superimposed face-house images that could be presented in the search streams. House and face stimuli were superimposed rather than presented in two separate search streams to avoid potential confounding effects of eye movements or spatial attention shifts between items. Moreover, since house and face stimuli were presented at the same location, MIs had a maximal opportunity to interfere. The relative weighting between the face (75%) and the house (25%) photograph (i.e., houses were three times as transparent as faces) was based on a behavioral pilot experiment (n = 10), to equalize task difficulty between the house detection and face detection task.
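The stimulus construction described above can be illustrated with a minimal sketch. It assumes that the semitransparent superimposition was a linear weighted average of pixel intensities, which the text does not state explicitly. The function name `superimpose` and the toy 2 × 2 "photographs" are hypothetical and serve only to show the 75%/25% weighting and the count of 18 × 18 = 324 face-house pairings.

```python
from itertools import product

N_FACES, N_HOUSES = 18, 18
FACE_WEIGHT, HOUSE_WEIGHT = 0.75, 0.25  # relative weighting reported in the text


def superimpose(face, house, w_face=FACE_WEIGHT, w_house=HOUSE_WEIGHT):
    """Blend two equally sized grayscale images (nested lists of intensities).

    Assumption: a linear weighted average; the actual blending procedure
    used for the stimuli is not specified in the text.
    """
    return [
        [w_face * f + w_house * h for f, h in zip(face_row, house_row)]
        for face_row, house_row in zip(face, house)
    ]


# Every face index paired with every house index.
pairs = list(product(range(N_FACES), range(N_HOUSES)))

# Toy 2 x 2 images standing in for real photographs (hypothetical values).
toy_face = [[200, 200], [100, 100]]
toy_house = [[0, 40], [80, 120]]
blended = superimpose(toy_face, toy_house)
```

Here `len(pairs)` reproduces the 324 superimposed images mentioned above, and each blended pixel is dominated by the face intensity, consistent with houses being three times as transparent as faces.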
Task design
Figure 1A illustrates the design of the experiment, which required subjects to perform a sequence of two search tasks on every trial. Each trial consisted of three phases (Fig. 1). In the encoding phase (Fig. 1a1), we presented two randomly chosen faces, houses, or a face and a house (1.9 s each with 100 ms in between). After 100 ms, a number (“1” or “2”) appeared (for 2 s) cueing the subjects that the first (50% of trials) or second object (the other 50%) was the ST for the first search (ST1), so that the other object was an accessory MI that had to be remembered for the second search task in which it would function as an ST. The cue followed the presentation of both objects to avoid potential differences in encoding strategies for targets of the first and second task. After a variable cue delay (4 ± 2 s), a search stream of 15 images (500 ms each; 7500 ms in total) was presented (Fig. 1a2, first search). Participants were instructed to search for the ST throughout the entire stream and respond as quickly as possible with the right index finger when they detected the target. Half a second after the search stream was finished, a second cue (a 1 if the first cue was 2 and vice versa) appeared, signaling that in the second search stream the MI would become the new target (i.e., MI1 changed into ST2), whereas the former ST should from then on be ignored (i.e., ST1 changed into MI2). Following a variable cue delay (4 ± 2 s), a second search stream was presented with characteristics similar to those of the first search stream (Fig. 1a3, second search), but subjects now had to search for the new ST (i.e., MI1, which is now ST2) and press the button as soon as they detected it. A small fixation cross was presented in the interval between trials (ITI; 6.5 ± 2 s). Each time interval of the cue delays and ITI occurred equally often for each search type.
Importantly, in both searches, the target (i.e., ST1 in search 1 and ST2 in search 2) and the irrelevant accessory item in WM (i.e., MI1 in search 1 and MI2 in search 2) could occur among regular distractors. More specifically, both search streams consisted of a sequence of 13 randomly chosen distractors (i.e., images that never contained the ST or MI) and two “stimuli-of-interest” (SOI). Each SOI image contained one attribute (e.g., the face) that was of interest in our analyses, whereas the other attribute (e.g., the house) of this image was a random distractor. The following combinations of SOIs could appear in a search stream: (1) the ST and MI, (2) the ST and a distractor, (3) the MI and a distractor, and (4) two distractors. Note that these prespecified distractors had the same qualities as the 13 other distractors in the stream, but were selected for further analysis to have an equal number of events (and temporal spacing) for the different SOIs. House and face images were never repeated within a stream. To optimize deconvolution analyses, the onset of the SOI presentation was synchronized to the onset of a volume measurement, with the constraint that the SOI never occurred as the first stimulus of the search stream. The interval between SOIs presented in a search stream was jittered (2 or 4 s).
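The timing constraints on the SOIs can be made concrete with a short sketch. It assumes that each search stream started on a volume onset, which the text implies but does not state. Under that assumption, with 500 ms images and a 2 s repetition time, only every fourth image coincides with a volume onset. The function names are hypothetical.

```python
IMAGE_MS = 500       # duration of each stream image
STREAM_LENGTH = 15   # images per search stream
TR_MS = 2000         # volume repetition time


def candidate_soi_positions():
    """Stream indices whose onset coincides with a volume onset.

    Assumption (not stated in the text): the stream itself begins on a
    volume onset. The first image is excluded, as described above.
    """
    step = TR_MS // IMAGE_MS
    return [i for i in range(STREAM_LENGTH) if i % step == 0 and i != 0]


def soi_position_pairs(allowed_gaps_ms=(2000, 4000)):
    """All ordered pairs of SOI positions separated by 2 or 4 s."""
    positions = candidate_soi_positions()
    return [
        (a, b)
        for a in positions
        for b in positions
        if a < b and (b - a) * IMAGE_MS in allowed_gaps_ms
    ]


positions = candidate_soi_positions()
pairs = soi_position_pairs()
```

Under these assumptions the candidate positions are the 5th, 9th, and 13th images (indices 4, 8, and 12), and every pair of them is separated by exactly 2 or 4 s, which matches the reported jitter between SOIs.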
The design was carefully balanced. Each combination of search types (i.e., two face search tasks, two house search tasks, first a face and then a house search task, or vice versa) was presented an equal number of times. Within each of these search combinations, the number of occurrences of the ST and MI in the stream were identical. The four SOI combinations occurred equally often, just like the combinations of SOIs presented in the first and second stream of one trial. Order of trials, order of cues, jittering of cue-delay and ITIs, selection of distractor stimuli in the search stream, and the position of the SOIs in the search stream all followed a randomization scheme in which all possibilities occurred equally often across trials.
Image acquisition
Echo-planar images (T2*-weighted; 64 × 64 imaging matrix, 28 slices, voxel size: 3.5 × 3.5 × 3.5 mm³, no gap, TR/TE = 2 s/35 ms, FA = 90°) covering almost the whole brain were collected on a 3 T Siemens Scanner (Siemens Medical Systems) using a standard head coil. Functional data were aligned to a T1-weighted high-resolution anatomical image (magnetization-prepared rapid acquisition gradient echo sequence; TR/TE = 2.3 s/3.93 ms, FA = 8°). Subjects viewed the stimuli, projected onto a frosted screen using a liquid crystal display projector (VPL-PX21; Sony), via a mirror mounted to the head coil. Stimuli were presented and responses were recorded using the Presentation software package (Neurobehavioral Systems). Stimulus presentation was synchronized with MR data acquisition.
Each of the nine healthy volunteers (five males; mean age 26.8 ± 2.7 years) performed two runs of the experiment (1230 volumes in total). Before the fMRI measurement, participants were familiarized with the stimuli and task. They were instructed to fixate on the fixation cross and to respond as fast and accurately as possible when they detected the target. A functional localizer of house- and face-preferring brain regions (160 volumes) was included in the scanning session, using a standard design in which blocks of rapid serially presented face photographs (three blocks) and house photographs (three blocks) were alternated with fixation blocks. All procedures were approved by the ethics committee of the Faculty of Psychology and Neuroscience of Maastricht University.
Data analysis
Behavioral data.
For each occurrence of an ST, MI, and regular distractor (D) in the stream, we determined whether a response was given in a 2 SD interval around the subject's mean reaction time to the STs. Responses within this interval were counted as false alarms when the item was an MI or D, and as hits when it was an ST. In contrast, the absence of a response within this interval was counted as a correct rejection for MIs or Ds and as a miss for STs. Reaction times of correct responses were submitted to a two-way repeated-measures ANOVA with stream (first, second search stream) and search category (face search, house search) as factors. In addition, accuracies were analyzed by applying a three-way repeated-measures ANOVA with stream (first, second search stream), search category (face search, house search), and search item type (ST, MI, and D) as factors. Greenhouse–Geisser corrected p values are reported for tests in which the sphericity assumption was violated (according to Mauchly's test of sphericity).
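The response classification described above can be sketched as follows. The sketch reads the "2 SD interval" as mean RT ± 2 SD, which is an assumption, since the text could also mean an interval with a total width of 2 SD. The function names, the SD value, and the example onsets are hypothetical.

```python
def response_window(mean_rt_ms, sd_rt_ms, n_sd=2.0):
    """Latency window around the subject's mean RT to search targets.

    Assumption: the window is mean +/- n_sd * SD.
    """
    return (mean_rt_ms - n_sd * sd_rt_ms, mean_rt_ms + n_sd * sd_rt_ms)


def classify(item_type, onset_ms, response_times_ms, window):
    """Label one stream item ('ST', 'MI', or 'D') given all button presses."""
    low, high = window
    responded = any(low <= r - onset_ms <= high for r in response_times_ms)
    if item_type == "ST":
        return "hit" if responded else "miss"
    return "false alarm" if responded else "correct rejection"


# Hypothetical subject: mean RT 676 ms (as reported), SD 100 ms (invented).
window = response_window(676, 100)
presses = [2700, 6500]  # button-press times within the stream, in ms

labels = [
    classify("ST", 2000, presses, window),  # latency 700 ms, inside window
    classify("MI", 4000, presses, window),  # no press inside window
    classify("D", 6000, presses, window),   # latency 500 ms, inside window
]
```

With these invented values the three items are labeled as a hit, a correct rejection, and a false alarm, respectively.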
fMRI data.
Preprocessing of the individual datasets included slice scan time correction; linear trend removal; temporal highpass filtering; 3D motion correction; transformation into Talairach space (Talairach and Tournoux, 1988); and cortex reconstruction, inflation, and flattening as implemented in the BrainVoyager QX software package (Brain Innovation). The first two volumes of each run were discarded to remove T1 saturation effects. No spatial smoothing was applied to the functional data, which were interpolated to a 3 × 3 × 3 mm³ voxel target resolution.
Whole-brain as well as volume-of-interest (VOI) random effects (RFX) analyses were performed to investigate (1) sustained modulations throughout search and (2) transient modulations when specific items in the search stream were encountered. Sustained neural modulation throughout search was analyzed with a two-way RFX ANOVA with category of ST (search Face, search House) and category of MI (memory Face, memory House) as factors. Correspondingly, the design matrix for sustained analyses included eight predictors modeling the target encoding (face, house), cue delay (prepare for face or house search), and presentation of search stream 1 (face and house search with face or house MI held on-line for next search). In this first analysis series, only search 1 was of interest. Therefore, the different types of search 2 were modeled with one predictor. In the second analysis series, we used an identical design matrix, but with search stream 2 split for the four different types, whereas search stream 1 was modeled with a single predictor. Furthermore, only search periods before target encounter were included in these analyses, because participants might have stopped searching after detecting a target. A separate predictor modeled the remainder of the stream period after a behavioral response was recorded. Each predictor's boxcar function was convolved with a standard two-gamma hemodynamic response function. In addition to the univariate ANOVA, we performed a multivoxel pattern analysis to study differences between house and face accessory MIs with higher sensitivity. In the searchlight analysis (Kriegeskorte et al., 2006; BrainVoyager QX v. 2.4), a spherical aperture with a radius of 6 mm was placed at each voxel of the whole brain to detect local multivariate differences within the aperture by measuring activation pattern distances from the first search stream in which a face versus house MI was maintained.
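The construction of a sustained predictor (a boxcar convolved with a two-gamma hemodynamic response function) can be illustrated with a minimal sketch. The gamma shape values and the undershoot ratio below are commonly used canonical values and are assumptions here, because the text does not report the parameters. The boxcar of four volumes only approximates a full 7.5 s stream at TR resolution, and all function names are hypothetical.

```python
import math

TR_S = 2.0  # repetition time in seconds, from the acquisition parameters


def two_gamma_hrf(t, peak_shape=6.0, undershoot_shape=16.0, ratio=6.0):
    """Two-gamma HRF at time t (s): a positive gamma minus a scaled, later one.

    Assumption: shape values and ratio are generic canonical choices,
    not parameters reported in the text.
    """
    if t <= 0:
        return 0.0
    peak = t ** (peak_shape - 1) * math.exp(-t) / math.gamma(peak_shape)
    under = (t ** (undershoot_shape - 1) * math.exp(-t)
             / math.gamma(undershoot_shape))
    return peak - under / ratio


def convolve_boxcar(boxcar, tr_s=TR_S, hrf_length_s=30.0):
    """Discrete convolution of a per-volume boxcar with the sampled HRF."""
    kernel = [two_gamma_hrf(i * tr_s) for i in range(int(hrf_length_s / tr_s) + 1)]
    return [
        sum(boxcar[n - k] * kernel[k] for k in range(len(kernel)) if n - k >= 0)
        for n in range(len(boxcar))
    ]


# Hypothetical run fragment: search period from volume 5 to volume 8.
boxcar = [0] * 5 + [1] * 4 + [0] * 15
predictor = convolve_boxcar(boxcar)
```

The resulting predictor rises only after the onset of the search period, peaks a few volumes later, and dips below zero afterwards, reflecting the post-stimulus undershoot of the two-gamma model.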
To study transient modulations elicited by the ST, MI, and D in the search stream, we performed deconvolution analyses, which minimize the interference between responses to temporally adjacent events (Glover, 1999). Each event was incorporated in the transient analyses design matrix with six delta-function predictors, modeling each of the six time points of the elicited hemodynamic response independently (Glover, 1999). Two events represented the target encoding and the cue period (face and house searches were combined). In the first analysis series SOIs in the first search stream were modeled, whereas SOIs in the second search stream were modeled in the second analysis series. These SOIs were represented by one of the 12 “SOI events,” modeling the ST, MI, and a (prespecified) D, for the four combinations of face/house ST and MI. SOIs to which an incorrect response was given by the participant were not included in an SOI predictor. Likewise, SOIs presented after the participant's response to a search stream stimulus were also not included in one of the SOI events to ensure that the potential attentional capture of MIs was not overlooked in the event-related averages by including responses to MIs that were possibly no longer attended. All search stream periods that were not included in one of the SOI events were included in one “non-SOI” predictor. To optimize the estimation of responses to the SOIs, this non-SOI event was used as baseline instead of the ITI.
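The deconvolution design can be sketched as a finite impulse response design matrix in which each event type receives six delta-function predictors, one per volume after event onset. This is a generic illustration, not the BrainVoyager implementation. The function name, the event labels, and the onsets (given in volumes) are hypothetical.

```python
def fir_design_matrix(n_volumes, onsets_by_event, n_lags=6):
    """Build a deconvolution (FIR) design matrix.

    Each event type gets n_lags columns. Column (event, lag) is 1 at
    volume onset + lag for every onset of that event and 0 elsewhere.
    Onsets are expressed in volumes, since SOI onsets were synchronized
    to volume acquisition.
    """
    columns = [
        (name, lag)
        for name in sorted(onsets_by_event)
        for lag in range(n_lags)
    ]
    matrix = [[0] * len(columns) for _ in range(n_volumes)]
    for col, (name, lag) in enumerate(columns):
        for onset in onsets_by_event[name]:
            row = onset + lag
            if row < n_volumes:
                matrix[row][col] = 1
    return columns, matrix


# Hypothetical onsets: two search targets and one memory item.
columns, X = fir_design_matrix(20, {"ST": [2, 10], "MI": [4]})
```

Because each time point of the response has its own predictor, no response shape is assumed, and overlapping responses to SOIs presented only 2 or 4 s apart can be estimated separately by the regression.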
The peaks of the parahippocampal place area (PPA) and fusiform face area (FFA) responses (see below) during house and face search, respectively, were submitted to a two-way RFX ANOVA with category of MI (memory Face, memory House) and encountered item type (ST, MI, D) as factors. One subject, who showed no identifiable transient responses during the second search and had a low hit rate for that search (<60%), was excluded from the analysis of transient responses in the second search.
VOI analyses were confined to the face-preferring FFA (Kanwisher et al., 1997) and house-preferring PPA (Epstein and Kanwisher, 1998), which were identified for each participant using an independent localizer run and standard mapping methods. The left FFA was excluded from further analyses, because (following the right lateralization of face processing; Cabeza and Nyberg, 1997) we could not define a significant face-specific region in the left fusiform gyrus for all participants. Furthermore, PPA activity was combined across hemispheres, because left and right PPA effects did not differ.
Results
Behavioral results
The mean reaction time (RT) to STs averaged across the first and second stream was 676 ms (±11 ms SE). The RT did not differ between house (661 ± 10 ms) and face search (691 ± 20 ms; p > 0.2), nor between the first (671 ± 16 ms) and second search stream (682 ± 10 ms; p > 0.4), and these two factors did not interact (p > 0.4).
Similar to RT, accuracy was not influenced by search category (p > 0.1), suggesting that the differential transparency between house and face images was an effective manipulation to equalize task difficulty. Accuracy was higher for the first (92.4 ± 2.2%) compared with the second (89.5 ± 2.1%) search stream (F(1,8) = 5.8; p < 0.05). Furthermore, there was a main effect of search item type (F(1,8) = 25.9; p = 0.001). There were no interactions between factors (all p > 0.09), except for search item type and stream, which tended to interact (F(2,16) = 4.3; p = 0.06). Therefore, post hoc tests on the different item types were performed for the first and second search separately: the percentage of hits (STs) was lower than the percentage of correct rejections (Ds) in the first (t(8) = 4.1; p < 0.003) as well as in the second (t(8) = 5.2; p < 0.001) stream. More importantly, during the first search, the rates of correct rejections of MI1s (97.6 ± 2.1%) and distractors (98.6 ± 0.8%) did not differ (p > 0.4). Likewise, when MI2 appeared as a distractor in the second search, it did not elicit more false alarms than Ds (correct rejection rate 97.9 ± 1.2% and 97.6 ± 0.9%, respectively; p > 0.8). Thus, the MI had a different state in WM than the ST: items in the stream that matched the MI were treated as Ds.
Together, these results indicate that the subjects memorized both the ST as well as the MI, but did not confuse them, even though the ST and MI switched roles in between the first and the second search (i.e., ST1 → MI2 and MI1 → ST2).
fMRI Results
Sustained modulations throughout search
We next investigated the effect of maintaining STs and MIs on activity in object-selective visual cortex. Specifically, we analyzed sustained activity changes during search stream 1 in the FFA and PPA, which were identified for each participant using an independent localizer run (Fig. 2A). Throughout visual search, the search template induced sustained enhanced activity in the higher visual area specialized for processing the category of the target: activity in the PPA was enhanced for a house compared with a face ST1 (F(1,8) = 64.8, p < 0.00005), whereas activity in the FFA showed the opposite pattern (F(1,8) = 150.2, p = 0.000002; Fig. 2B, compare blue-tinted vs red-tinted bars). In marked contrast, the MI representation in WM exerted a suppressive influence on ongoing activity: PPA activity was decreased for a house compared with a face MI1 (F(1,8) = 15.9, p < 0.005). Likewise, a face MI1 suppressed FFA activity compared with when a house needed to be remembered for the next task (F(1,8) = 12.6, p < 0.01; Fig. 2B, compare the difference between the light and dark bars of the same color). No interaction between influences of the search template and MI was observed in these areas (all p > 0.1).
We also performed a multivoxel searchlight analysis to explore whether there were differential activity patterns for maintaining a face compared with house MI during search, beyond the ones we could detect with univariate methods. Distinct activity patterns for face versus house MI conditions (p < 0.05) were among others observed in bilateral dorsolateral frontal cortex, higher visual areas, and basal ganglia (Table 1; only clusters with >300 voxels are reported).
Transient responses to items in the search stream
What happens when the target or MI is encountered during search? Figure 3A shows the extended target detection network that was activated when ST1 was encountered in the search stream. Activations were found in visual areas, as well as in the frontal cortex (e.g., putative location of right frontal eye field [Blanke et al., 2000]), the anterior insula, and in parietal regions (see Table 2 for details). Additional activations were revealed in cerebellar and subcortical (mainly thalamic) structures (data not shown). Finally, activations related to button presses were revealed in the left (post) central sulcus and the (pre-) supplementary motor area (Picard and Strick, 2001). In stark contrast, not a single voxel showed a significant increase in activity when MI1 was encountered.
In addition, FFA and PPA responses to individual items in the (respectively, face and house) search stream were submitted to a two-way RFX ANOVA with MI category (house, face) and search item type (ST1, MI1, and D1) as factors. In accordance with the sustained effects, transient PPA responses were suppressed if a house compared with a face MI was maintained in WM (F(1,8) = 11.1, p = 0.01). The corresponding effect in FFA did not reach significance (p > 0.1). In addition, there was a main effect of search item type in PPA (F(3,24) = 5.2, p < 0.02) and FFA (F(3,24) = 14.1, p < 0.0005) and there was no interaction between the MI category and search item type in FFA or PPA (p > 0.1). Further post hoc RFX contrasts between ST1, MI1, and distractor (D1) responses revealed that targets, but not MIs, induced enhancements (Fig. 3B): the PPA responded more strongly to house ST1s than to distractors in the search stream (t(8) = 3.2; p < 0.02). A similar enhanced response was observed for face ST1s in the FFA (t(6) = 3.3; p < 0.02). In stark contrast, the MI1s did not induce such a match enhancement (Fig. 3B), in line with the whole-brain analysis. That is, neural responses to MI1s in the search stream did not differ from regular distractors in PPA (p > 0.7) or FFA (p > 0.9). Further tests showed that this was the case for all four types of MIs (i.e., FFA and PPA responses to face or house MI1 during face or house search did not differ from those evoked by regular distractors). In sum, we did not observe match enhancement for input matching the MI, regardless of whether the MI belonged to the same category as the ST or to a different one.
Search stream 2: sustained and transient responses
During the second search stream subjects had to stop searching for ST1, and start searching for the stimulus that had previously been MI1 which now became the target of the second search (i.e., MI1 turned into ST2). Although the main focus of this paper is on neural processes during search 1, we performed similar analyses on stream 2. Behavioral results (Fig. 1B) showed that subjects correctly updated the search template from ST1 to ST2 in the second search. This indicates that the status of items in WM can be rapidly changed in accordance with changing task demands.
This change in status of the ST and MI was reflected by the patterns of neural activity. RFX analyses on sustained responses during search stream 2 showed that activations elicited by ST2 and MI2 were highly similar to ST1 and MI1 in the first search (compare Figs. 2, 4), respectively. A house ST2 enhanced activity in PPA (F(1,8) = 82.9, p < 0.00002), whereas a face ST2 enhanced FFA activity (F(1,8) = 70.0, p < 0.00005). Moreover, a house MI2 decreased PPA activity (F(1,8) = 15.0, p < 0.005), although the same comparison in FFA failed to reach significance (p > 0.1). Finally, similar to search 1, no interaction between influences of the search template and MI was observed in any of the areas (p > 0.4). In sum, similar to search 1, the ST induced sustained enhancements (in FFA and PPA), whereas MIs caused suppression (in PPA) throughout search 2.
The observed transient modulations during search 2 further confirmed that ST2 and MI2 behaved similarly to ST1 and MI1 in search 1 (even though ST2 is the same stimulus as MI1, and MI2 is ST1). Whole-brain voxelwise regression analysis revealed a network for new targets (ST2s) compared with distractors resembling the target detection network observed for search 1. Contrasting the new MIs (MI2s) to Ds, on the other hand, did not show a comparable match detection network. In contrast to search 1, however, we did observe some small patches responding more strongly to MI2, which were mainly located in prefrontal areas (patches > 10 mm²: left middle frontal gyrus x = −39, y = 10, z = 45 Talairach coordinates, 12 mm²; right middle temporal gyrus x = 60, y = −10, z = −7, 15 mm²).
Finally, we performed two-way RFX ANOVAs on transient PPA and FFA responses with category of MI and search item type as factors. PPA responses were influenced by item type (F(1,7) = 7.8, p = 0.005). Post hoc RFX tests revealed enhanced processing of items matching the new search template (ST2) compared with Ds (t(7) = 3.5; p < 0.02). In contrast, MI2 did not differ from other distractors (p > 0.6). The main effect for MI category (unlike the sustained response) did not reach significance (p > 0.1) in the RFX analysis. However, suppression by the house MI2 (25.7% lower β-weights compared with face MI2) was significant in a fixed-effects analysis (t(7) = 2.1; p < 0.05). FFA activity showed a pattern corresponding to the activity in PPA: FFA responses tended to be higher when ST2 compared with a distractor was encountered in the stream (t = 2.0; p = 0.1). In contrast, responses to MI2 were similar to responses to Ds (p > 0.3). The suppression by the face MI2 (29.1% lower β-weights compared with house MI2) did not reach significance (p > 0.2). Finally, in accordance with all previous analyses, there was no interaction between MI category and stream item type (p > 0.4).
Discussion
The present study provides new insights into the organization of WM, and the influence of its contents on attentional processing. The results show that items in memory can attain (at least) two different states, with opposing influences on processing in category-selective visual cortex: Task-relevant items in WM, such as the ST representation, enhance activity in higher visual areas. In marked contrast, items in WM that are irrelevant for the current task exert suppressive influences on higher visual areas. These complementary effects provide a solid basis for efficient search: The representation of the search template in visual cortex is enhanced, providing a competitive advantage for matching input (i.e., targets) to be selected for further processing. In parallel, the representation of the currently irrelevant memory content is suppressed, which can help to avoid the erroneous selection of irrelevant input.
Searching for a target was associated with a sustained enhancement of activity in visual areas specialized in processing the target object category. Face search robustly increased FFA activity throughout search, whereas house search enhanced processing in the PPA. These modulations could not be stimulus driven, since stimulus input did not differ between conditions. Rather, top-down signals conveying information about the ST appear to drive the enhanced processing of the attribute of the superimposed images that corresponds to the target's category (Fuster et al., 1985; Rainer et al., 1999; Freedman et al., 2001). Such sustained object-based attentional modulations in specialized visual areas are consistent with previous fMRI findings (O'Craven et al., 1999; Serences et al., 2004). Task-relevant items in WM were thus strongly represented in visual areas throughout search. Moreover, transient increases in activation were observed when input that matched the ST was encountered. This selective processing of the search template agrees with predictions of the biased competition model (Desimone and Duncan, 1995). Neurophysiological recordings suggest that frontal areas indeed bias activity in visual cortex to increase the activity of neurons representing the target object, thereby guiding attention to matching items in the visual display (Chelazzi et al., 1993, 1998; Bichot et al., 2005). This enhanced activity state presets neurons to quickly and effectively select matching input for further attentive processing, as reflected at the neuronal level by a transient “match enhancement” (Miller and Desimone, 1994). In addition to these enhancements in visual cortex, an extended frontoparietal network showed increased activity upon target detection, in accordance with previous studies (Jiang et al., 2000; Druzgal and D'Esposito, 2001). This network included the frontal eye field and intraparietal sulcus (Jiang et al., 2000) and overlapped with the dorsal frontoparietal network engaged in goal-directed deployment of attention (Corbetta and Shulman, 2002). In contrast, visual items matching the MI did not cause match enhancements and did not activate this target-detection network.
It is clear that the neural signatures of task-relevant and task-irrelevant WM representations differ. Whereas the task-relevant WM item increased sustained activity in category-selective visual cortex, the “accessory” WM item, merely stored for later use, suppressed activity in the same regions. That is, activity in the PPA was reduced when a house (compared with a face) was maintained in WM for a subsequent search. In the FFA, a similar effect was found when a face had to be held on-line. In both areas, this suppressive effect existed, regardless of whether the search template was a house or a face. In accordance with these results, recent fMRI findings showed that attentional processing of nonrelevant information is suppressed (Polk et al., 2008), especially if this information should not be encoded in WM (Gazzaley et al., 2005). This suppressive influence of irrelevant WM items might be beneficial. Suppressing activity of neurons that represent the task-irrelevant WM item presumably prevents the detection of visual input that matches with the accessory item, thereby increasing search efficiency. Accordingly, we did not observe neural enhancements for objects matching the accessory WM item, suggesting that this potentially interfering input is not attentively processed. Given the limited detection power of fMRI deconvolution analyses, this null effect should be interpreted with caution. However, our results agree with a previous event-related brain potential study in which we compared neural processing of MIs and Ds in a search stream with millisecond resolution across many trials, and did not observe differences between the two in any of the processing stages (Peters et al., 2009).
The suppression by accessory MIs might be related to other processes such as visual marking (Watson et al., 2003) and dimension weighting (Found and Müller, 1996), processes that enable us to improve search efficiency by ignoring specific distractors. Of special interest is the observation by Woodman and Luck (2007) that irrelevant items in WM can serve as a “template for rejection,” repelling attention from matching items in the display and facilitating search. More generally, the presence of items of a specific category in WM reduces the interference of this category on another task (Kim et al., 2005; Lavie et al., 2005; Park et al., 2007), a finding that is in accordance with such a template for rejection and also with the decrease of activity in category-specific visual cortex.
Interestingly, we also found that the former target, MI2, caused sustained suppression of activity during the second search, perhaps because it could not be immediately released from WM (Oberauer, 2001). The previous search for this item might have “refreshed” the WM representation of MI2, requiring a stronger inhibition to repel attention from MI2 if presented in search 2. This might explain the increased prefrontal activity when MI2 was indeed encountered during search 2.
Our study was not designed to elucidate where and how the search template and accessory MIs are stored. Our finding of sustained activity in object-selective cortex is compatible with recent theories suggesting that WM storage does not rely solely on prefrontal cortex, but that it also involves representations in higher visual areas (Pasternak and Greenlee, 2005; Ranganath and D'Esposito, 2005; Postle, 2006; Lewis-Peacock and Postle, 2008; Nee and Jonides, 2008; Woloszyn and Sheinberg, 2009) and even early visual cortex (Harrison and Tong, 2009). However, previous studies suggest that the prefrontal cortex plays a pivotal role in partitioning task-relevant and accessory information in WM (Soto and Humphreys, 2006; Warden and Miller, 2007, 2010). In these prefrontal areas, MIs might be stored in an “orthogonal” code that does not interfere with another task (Sigala et al., 2008; Panzeri et al., 2010; Fell and Axmacher, 2011). We cannot exclude the possibility that these items are stored in a structural form composed of synaptic traces (Mongillo et al., 2008; Sugase-Miyamoto et al., 2008; Lewis-Peacock et al., 2012), which has properties in common with long-term memory. However, single-cell recordings revealed that the persistent firing of neurons in prefrontal cortex codes for the identity of an item that is stored for later use. Furthermore, our searchlight analysis revealed category-selective activation patterns in dorsolateral prefrontal cortical areas, which is in line with the hypothesis that the accessory MIs are stored as persistent activity in WM.
The accessory MI might be “forced” into its specific state by the presence of the search template (Olivers et al., 2011). Studies that used a “varied mapping” design (Schneider and Shiffrin, 1977), which requires a new search template on every trial, observed no attentional capture by MIs in the search array (Downing and Dodds, 2004). In contrast, studies using a “consistent mapping” design, in which the search template occupies very little space in WM, did find attentional capture by MIs (Soto et al., 2005, 2007; Olivers, 2009). This suggests that the presence of the search template affects the state of accessory MIs. Future studies could further explore the neural effects of this apparent interaction.
In conclusion, our data suggest a subdivision in WM between task-relevant and task-irrelevant content. Task-relevant items in WM enhance activity of visual neurons processing corresponding input, whereas currently irrelevant MIs exert an opposite influence. Consequently, only input matching task-relevant MIs is selected for further processing, whereas objects matching accessory WM content do not capture attention. This dual mechanism might aid our ability to focus on relevant information while simultaneously ignoring distracting input, even when faced with stimuli matching items we deliberately hold on-line for later use.
Footnotes
This research was supported by NWO Grant No. 402-01-632 to R.G. We thank Bettina Sorger for useful suggestions and Joel Reithler for various contributions during several stages of this project.
The authors declare no competing financial interests.
Correspondence should be addressed to Dr. Judith C. Peters, Department of Neuroimaging and Neuromodeling, Netherlands Institute for Neuroscience (KNAW), Meibergdreef 47, 1105 BA Amsterdam, The Netherlands. j.peters{at}nin.knaw.nl