Abstract
Many important situations require human observers to simultaneously search for more than one object. Despite a long history of research into visual search, the behavioral and neural mechanisms associated with multiple-target search are poorly understood. Here we test the novel theory that the efficiency of looking for multiple targets critically depends on the mode of cognitive control the environment affords to the observer. We used an innovative combination of electroencephalogram (EEG) and eye tracking while participants searched for two targets, within two different contexts: either both targets were present in the search display and observers were free to prioritize either one of them, thus enabling proactive control over selection; or only one of the two targets would be present in each search display, which requires reactive control to reconfigure selection when the wrong target has been prioritized. During proactive control, both univariate and multivariate signals of beta-band (15–35 Hz) power suppression before display onset predicted switches between target selections. This signal originated over midfrontal and sensorimotor regions and has previously been associated with endogenous state changes. In contrast, imposed target selections requiring reactive control elicited prefrontal power enhancements in the delta/theta band (2–8 Hz), but only after display onset. This signal predicted individual differences in associated oculomotor switch costs, reflecting reactive reconfiguration of target selection. The results provide compelling evidence that multiple target representations are differentially prioritized during visual search, and for the first time reveal distinct neural mechanisms underlying proactive and reactive control over multiple-target search.
SIGNIFICANCE STATEMENT Searching for more than one object in complex visual scenes can be detrimental for search performance. Although perhaps annoying in daily life, this can have severe consequences in professional settings such as medical and security screening. Previous research has not yet resolved whether multiple-target search involves changing priorities in what people attend to, and how such changes are controlled. We approached these questions by concurrently measuring cortical activity and eye movements using EEG and eye tracking while observers searched for multiple possible targets. Our findings provide the first unequivocal support for the existence of two modes of control during multiple-target search, which are expressed in qualitatively distinct time-frequency signatures of the EEG both before and after visual selection.
Introduction
Baggage scanning, medical image screening, and sports match refereeing are just a few of the activities in which human observers are required to look for multiple relevant visual signals. Studies of visual search behavior have found that multiple-target search comes with performance costs (Maljkovic and Nakayama, 1994; Houtkamp and Roelfsema, 2009; Menneer et al., 2009; Dombrowe et al., 2011; Grubert and Eimer, 2013; Mitroff et al., 2015; Liu and Jigo, 2017). Targets are detected more slowly, and are more often missed when observers try to search for more than a single object simultaneously. These costs emerge in particular when a target changes between consecutive searches compared with when it repeats, suggesting a differential prioritization of the different target representations (Found and Müller, 1996; Huang and Pashler, 2007; Kristjánsson and Campana, 2010; Olivers et al., 2011). However, other studies have reported evidence that two different target objects can be found interchangeably without switch costs, thus supporting theories which state that multiple targets can be prioritized equally in parallel (Beck et al., 2012; Grubert and Eimer, 2015; Beck and Hollingworth, 2017; Kristjánsson et al., 2018).
Recent behavioral evidence from our laboratory indicates that the environmental context, and the type of cognitive control mechanisms it allows for, is an important determinant of switch costs in multiple-target search (Ort et al., 2017, 2018). Using a gaze-contingent search task in which observers looked for two different targets, we found that saccade latencies were prolonged when the target changed from one trial to the next, but only when just one of the two targets was available per display. In this context, target switches are necessarily imposed upon the observer. If the wrong target happens to be prioritized, this requires a reactive reconfiguration to select the unanticipated target (cf. Found and Müller, 1996; Monsell, 2003). By definition, such a reactive control process can only start after display onset, resulting in time costs. In contrast, we found that when both sought-for targets were available in each display, observers still frequently switched from one target to the other, but now without switch costs. With the foreknowledge of this target availability, observers can freely choose which target to select next. They can thus switch proactively, before each display, with little to no switch cost as a result.
This difference between enforced, reactive control and free, proactive control has been proposed before in the context of task switches (Braver, 2012), distractor suppression (Geng, 2014), and spatial cueing (Taylor et al., 2008), but its role in visual search is currently unknown. Moreover, because saccades are only the end result of the selection process, our previous findings provide at best an indirect measure of assumed modes of control. To provide a more direct measure of cognitive states both before and after target switches, we used a hybrid approach of concurrently measuring both eye gaze and the electroencephalogram (EEG) of participants instructed to look simultaneously for two different color-defined targets. Specifically, we tested the hypothesis that free target choice during search is supported by endogenously triggered, proactive control that may be akin to internally driven, voluntary action selection (Forstmann et al., 2007; Frith and Haggard, 2018), and most likely originates in medial and lateral frontal cortical areas (Taylor et al., 2008; Schuck et al., 2015; Wisniewski et al., 2015). Crucially, such a proactive control signal should already emerge before a target switch. Reactive control has previously been tied to an increase in oscillatory power in the theta frequency range (3–8 Hz) over prefrontal brain areas after an unexpected task-switch (Cunillera et al., 2012), a novel stimulus (Cavanagh et al., 2011), or response conflict (Cohen, 2014a). Yet its role in visual target selection is unknown. Here, we hypothesized that such reactive control-related signal changes might also occur in multiple-target search, but only after a target switch, and specifically when such switches are imposed.
Materials and Methods
Participants.
Thirty healthy human participants (18 male) with normal or corrected-to-normal vision participated in this study for course credit or monetary compensation. The study was conducted in accordance with the Declaration of Helsinki and was approved by the faculty's Scientific and Ethical Review Board (VCWE). Written informed consent was obtained.
Task.
Participants performed two conditions of a multiple-target gaze-contingent visual search task (Ort et al., 2017) in a blocked design. The two versions differed in whether both or only one of two memorized search targets were available for selection in the subsequent search displays. The following task settings and stimulus parameters were identical across these two conditions.
The stimulus set consisted of six colored disks extending over a visual angle of 1.3°. The RGB values of these colors were (0, 128, 175) for blue, (196, 79, 104) for red, (79, 123, 51) for green, (163, 107, 34) for brown, (142, 101, 183) for purple, and (120, 120, 120) for gray. All colors were isoluminant (M = 20 cd/m²). The background color was black (0, 0, 0).
After fixation drift correction (see Apparatus and eye tracking), a block began with a fixation cross for 500 ms, followed by a cue display for 2500 ms and another fixation cross for 500 ms (Fig. 1). The cue display consisted of two colored disks 1.06° to the left and right of fixation and indicated the two target colors for the upcoming sequence of 40 search displays. The search displays each consisted of four colored and two identical gray disks, arranged in a hexagonal lattice with vertical rows and each at a distance of 3.9° from the hexagon's center, which coincided with the fixation cross. Because of their regular positioning within a hexagon, the complete lattice on which stimuli could appear resembled a honeycomb structure. Participants were instructed to make a single eye movement toward a disk that matched either one of the target colors. The other items were distractors, not to be fixated. After target fixation, the stimuli were removed from the display and the fixated target was replaced by a white, filled circle, spanning 0.2°, to provide participants with a fixation point while they waited for the next search display. If gaze position was no farther than 1.95° from this fixation point, the next display appeared after 850–1050 ms (randomly jittered). If participants failed to fixate this point for 5000 ms, a warning message appeared in the middle of the screen for 1000 ms, reminding them to look at the fixation point while waiting for the next search display. Because the coordinates of the previously fixated target determined the starting position for the next display (the center of the hexagonal lattice), the search moved across the screen throughout a block, resembling natural eye movements during visual search when all items are present simultaneously.
When the stimulus sequence approached an edge or the corner of the screen, the target (or targets) was (were) randomly assigned to one (or two) of the three positions in the hexagon that were closest to the center of the screen, such that the next fixation would be directed away from the edge or corner. Although in such cases the number of positions at which the targets could appear was reduced, participants still could not predict where exactly a specific color would appear.
Fixations had to land within a 2° visual angle radius around the target to be counted as valid. This ensured that fixations for targets and/or distractors could never overlap. If participants fixated one of the distractors, they received auditory feedback and were required to make a corrective eye movement toward a target. The search was aborted if no target was fixated within 3000 ms, and a new search display appeared.
There were two main factors. First, at the block level, target availability was manipulated. In the Free selection condition, both cued target colors appeared in the search display together with two gray and two colored distractors. In contrast, in the Imposed selection condition, only one of the two cued colors appeared in the search display together with two gray distractors and three colored distractors. Note that distractor colors remained fixed at the block level, and could be target colors in other blocks. The second factor was whether target color selection switched or repeated. Note that this latter factor was determined by either the observer (Free selection condition), or by a random sampling procedure, in which a sequence of target switches and target repetitions was randomly drawn (with replacement) from a pool of potential sequences (Imposed selection condition). Note that only the sequence of target switches and repeats was replayed, not the specific colors or positions of the search items, so that participants could not anticipate where a particular search target would appear. To match switch rate and streak length (successive switch or repeat trials) between conditions, sequences that were obtained during Free selection blocks were used to constitute the pool of replay sequences for Imposed selection blocks, for each participant separately. The pool of replay sequences to draw from would grow as the experiment progressed. Because at the outset of the experiment we did not have any sequences yet to fill the pool with, we initialized the pool with four prespecified random sequences of target switches and repetitions (one each for 6, 8, 10, and 12 switches per block). Having a small proportion of fully random sequences also further prevented participants from recognizing the order of switches and repetitions in the sequences, while still closely matching switch rates between conditions.
A paired sample t test showed only a marginal (nonsignificant) difference between switch rates in the two conditions (t(29) = 1.91, p = 0.07; Free selection: 31.2%, Imposed selection: 28.9%). As a double-check, we also asked participants after the experiment whether they were aware of this replay manipulation in the Imposed selection blocks, and none of them were.
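The replay procedure can be illustrated with a minimal Python sketch. It is not the experiment code: the function names, the representation of a sequence as a list of booleans (True for a switch), and the use of 39 transitions for a block of 40 displays are all assumptions made for illustration.

```python
import random

def random_sequence(n_transitions, n_switches, rng):
    """Random order of switches (True) and repeats (False) with a fixed switch count."""
    seq = [True] * n_switches + [False] * (n_transitions - n_switches)
    rng.shuffle(seq)
    return seq

def initial_pool(rng, n_transitions=39):
    """Four starting sequences with 6, 8, 10, and 12 switches per block.
    Assumption: 40 displays per block give 39 trial-to-trial transitions."""
    return [random_sequence(n_transitions, k, rng) for k in (6, 8, 10, 12)]

def add_free_sequence(pool, chosen_colors):
    """Turn the colors selected in a Free selection block into a
    switch/repeat sequence and add it to the participant's pool."""
    seq = [prev != cur for prev, cur in zip(chosen_colors, chosen_colors[1:])]
    pool.append(seq)
    return seq

def draw_imposed_sequence(pool, rng):
    """Draw one sequence for an Imposed selection block, with replacement
    (the pool itself is left unchanged)."""
    return list(rng.choice(pool))

def switch_rate(seq):
    """Proportion of switch transitions in a sequence."""
    return sum(seq) / len(seq)
```

Only the order of switches and repeats is stored, so the colors and positions in an Imposed selection block can still be randomized independently of the replayed sequence.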
In total, there were 40 blocks consisting of 40 search displays each. The five potential target colors were combined into 10 unique two-color cue combinations. Per target availability condition, each of these combinations was used twice as the pair of target colors for a block. Before the experiment started, observers practiced two blocks of both the Free and Imposed selection conditions.
Apparatus and eye tracking.
The experiments were designed and presented using OpenSesame v3.1.4 (Mathôt et al., 2012) in combination with PyGaze v0.6, an eye-tracking toolbox (Dalmaijer et al., 2014). Stimuli were presented on a 22 inch (diagonal) Samsung Syncmaster 2233RZ with a resolution of 1680 × 1050 pixels and refresh rate of 120 Hz at a viewing distance of 75 cm. Eye movements were recorded with the SR Research EyeLink 1000 tracking system at a sampling rate of 1000 Hz and a spatial resolution of 0.01° visual angle. The experiments took place in a dimly lit, sound-attenuated room. The experimenter received real-time feedback on system accuracy on a second monitor located in an adjacent room. After every block, eye-tracker accuracy was assessed, and improved as needed by applying a 9-point calibration and validation procedure.
EEG recording and cleaning.
Concurrently with the eye-tracking (ET) data, EEG data were acquired at 512 Hz from 64 channels (using a BioSemi ActiveTwo system) placed according to the international 10-20 system, and from both earlobes (used as reference). Off-line, EEG and ET data were first coregistered using the EYE-EEG toolbox v0.4 (Dimigen et al., 2011) for EEGLAB v12.0.2.3b (Delorme and Makeig, 2004) in MATLAB 2014a and 2015a (MathWorks). All standard settings of the EYE-EEG tutorial were used (http://www2.hu-berlin.de/eyetracking-eeg); the minimum plausible interval between saccades was set to 50 ms; from clusters of saccades within this interval, only the first was stored. Quality of EEG-ET synchronization was visually inspected using recommendations from the EYE-EEG tutorial; all datasets showed good synchronization and eye movement properties (i.e., fixation heat maps and saccade angular histograms).
Next, EEG data were high-pass filtered at 0.5 Hz before time-frequency analysis to remove drifts and other non-stationarities (Cohen, 2014b), and at 2.5 Hz solely for independent component analysis (ICA) to improve its signal-to-noise ratio (Winkler et al., 2015; O. Dimigen, personal communication). Continuous EEG data were epoched from −2.5 to 3 s surrounding the onset of the search display (to avoid edge artifacts resulting from wavelet filtering, see below). The vertical electro-oculogram (EOG) was recorded from electrodes located 2 cm above and below the right eye, and the horizontal EOG was recorded from electrodes 1 cm lateral to the external canthi. The EOG data were used together with EEG and ET data for automatic detection of oculomotor independent components (see below). Epochs were baseline-normalized using the whole epoch as baseline, which has been shown to improve ICA (Groppe et al., 2009). Before cleaning, the data were visually inspected for malfunctioning electrodes, which were temporarily removed from the data (17 of 30 participants had 1–3 malfunctioning electrodes).
To detect epochs that were contaminated by muscle artifacts, we used an adapted version of an automatic trial-rejection procedure as implemented in the Fieldtrip toolbox (Oostenveld et al., 2011), using a 110–140 Hz pass-band to capture high-frequency muscle activity, and allowing for variable z-score cutoffs per participant based on the within-subject variance of z-scores. This procedure resulted in an average of 7.86% rejected trials (min–max across participants: 1.56–21.38%). After trial rejection, we performed ICA as implemented in EEGLAB only on the clean EEG and EOG electrodes. Next, correlations between ET and independent components were used to automatically detect oculomotor artifacts, using the variance-ratio criterion suggested by Plöchl et al. (2012) and as implemented in EYE-EEG; we removed on average 3.63 components (min–max across participants: 1–5). Finally, the malfunctioning electrodes identified before ICA were interpolated using EEGLAB's eeg_interp.m function.
We only selected those trials that had a “clean” saccade-trajectory from initial fixation after search-display onset, to final fixation on a target-matching disk (which marked search-display offset). That is, intermediate fixations within such a trajectory had to fall within 30° around a straight line from initial fixation to (1 of 2) target(s). Trials that did not meet this criterion may have had trajectories in which saccades were first drawn toward distractors, even though they finally landed on a correct target. This selection procedure together with EEG artifact rejection resulted in an average of 360 (min–max across participants: 142–599) repeat and 145 (28–316) switch trials being retained in the Free selection condition, and 351 (177–521) repeat and 113 (45–188) switch trials in the Imposed selection condition.
For time-frequency analyses, first the surface Laplacian of the EEG data was estimated (Perrin et al., 1989; Kayser and Tenke, 2015), which sharpens EEG topography and filters out distant effects that may be due to volume conducted activity from deeper brain sources (Oostendorp and van Oosterom, 1996; Winter et al., 2007). The Laplacian can thus be interpreted as a spatial high-pass filter. For estimating the surface Laplacian, we used a 10th-order Legendre polynomial, and lambda was set at $10^{-5}$.
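One possible reading of the trajectory criterion is sketched below in Python. The sketch assumes that "within 30°" refers to the angular deviation of each intermediate fixation from the straight line, measured from the initial fixation; the text does not say whether the tolerance is ±30° or a 30° window, so it is exposed as a parameter. The function names are hypothetical.

```python
import math

def angular_deviation(start, point, target):
    """Absolute angle in degrees between the lines start->point and start->target."""
    a_point = math.atan2(point[1] - start[1], point[0] - start[0])
    a_target = math.atan2(target[1] - start[1], target[0] - start[0])
    diff = abs(a_point - a_target) % (2 * math.pi)
    return math.degrees(min(diff, 2 * math.pi - diff))

def is_clean_trajectory(start, intermediate, target, max_dev_deg=30.0):
    """True if every intermediate fixation stays within max_dev_deg of the
    straight line from the initial fixation to the target.
    Assumption: the deviation is measured as an angle from the initial fixation."""
    return all(
        angular_deviation(start, p, target) <= max_dev_deg for p in intermediate
    )
```

A trial with no intermediate fixations passes trivially, which matches the case of a single direct saccade to the target.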
Behavioral analysis.
Our main behavioral variable of interest was trial-averaged latencies of the first eye movement (dwell time before a saccade toward a target was executed). Mean saccade latencies were computed separately for the Free and Imposed selection blocks, and separately for repeat trials (selected target color at trial N was the same as the selected target color at trial N − 1) and switch trials (selected target color at trial N was different from the selected target color at trial N − 1). We took the first saccade after search display onset, provided that it met the selection criterion as described above. Next, a saccade latency filter was applied, in which saccades with latencies shorter than 100 ms or more than 3 SD above the block mean for that participant were excluded (average of 2.5% of all trials). Average saccade latencies per participant were entered in a repeated-measures ANOVA with factors trial type (repeat and switch) and condition (Free and Imposed), using JASP v0.9 (https://www.jasp-stats.org).
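The latency filter and the switch-cost measure can be sketched as follows. This Python example assumes that the block mean and SD are computed over all latencies in the block before any exclusion, which the text does not specify, and the function names are illustrative.

```python
import statistics

def filter_latencies(latencies_ms, min_ms=100.0, n_sd=3.0):
    """Keep latencies of at least min_ms that lie no more than n_sd sample SDs
    above the block mean.
    Assumption: mean and SD are computed on the unfiltered block."""
    mean = statistics.mean(latencies_ms)
    sd = statistics.stdev(latencies_ms)
    upper = mean + n_sd * sd
    return [x for x in latencies_ms if min_ms <= x <= upper]

def switch_cost(switch_latencies_ms, repeat_latencies_ms):
    """Mean latency on switch trials minus mean latency on repeat trials."""
    return statistics.mean(switch_latencies_ms) - statistics.mean(repeat_latencies_ms)
```

Applied per participant and condition, `switch_cost` gives the quantity compared between the Free and Imposed selection conditions.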
EEG time-frequency decomposition.
Epoched EEG time series were decomposed into their time-frequency representations with custom-written MATLAB code (github.com/joramvd/tfdecomp). Each epoch was convolved with a set of complex Morlet wavelets with frequencies ranging from 1 to 40 Hz in 50 linearly spaced steps. Wavelets were created by multiplying perfect sine waves ($e^{i2\pi ft}$, where i is the complex operator, f is frequency, and t is time) with a Gaussian ($e^{-t^2/(2s^2)}$, where s is the width of the Gaussian). The width of the Gaussian was set according to $s = \delta/(2\pi f)$, where δ represents the number of cycles of each wavelet, linearly spaced between 3 (for 1 Hz) and 12 (for 40 Hz) to have a good tradeoff between temporal and frequency precision. From the complex convolution result $Z_t$ (downsampled to 40 Hz), an estimate of frequency-specific power at each time point was defined as $[\mathrm{real}(Z_t)^2 + \mathrm{imag}(Z_t)^2]$. Single-trial power at each time-frequency point was used for a linear discriminant classification analysis (see below). Trial averaged power at each time-frequency point was decibel normalized according to $10 \times \log_{10}(\mathrm{power}/\mathrm{baseline})$, where for each channel and frequency, the condition averaged power during the entire trial served as baseline activity. We chose this baseline procedure because in a fast-paced saccade-driven trial design there is no optimal neutral baseline time window in, e.g., the intertrial interval, because of potential condition differences in both prestimulus and presaccadic and postsaccadic activity. Some baseline normalization procedure is nonetheless necessary to transform frequency-specific power to one common scale (i.e., to remove the 1/f scaling of power), and to correct for single-trial outliers (raw power cannot go <0 but can take relatively large values). Importantly, our main dependent variable was the difference in time-frequency power between switch and repeat trials (switch > repeat), in which any common deviation from “baseline” was subtracted out.
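The wavelet construction and power estimate described above can be sketched in plain Python. This is an illustration under stated assumptions, not the tfdecomp code: the function names, the 1 s wavelet half-width, and the direct time-domain convolution at a single sample are choices made here for brevity.

```python
import cmath
import math

def n_cycles(freq, f_min=1.0, f_max=40.0, c_min=3.0, c_max=12.0):
    """Number of wavelet cycles, linearly interpolated from 3 (1 Hz) to 12 (40 Hz)."""
    return c_min + (c_max - c_min) * (freq - f_min) / (f_max - f_min)

def morlet(freq, srate, half_width_s=1.0):
    """Complex Morlet wavelet: exp(i*2*pi*f*t) * exp(-t^2 / (2*s^2)),
    with s = cycles / (2*pi*f). The half-width in seconds is an assumption."""
    s = n_cycles(freq) / (2 * math.pi * freq)
    n = int(round(half_width_s * srate))
    times = [k / srate for k in range(-n, n + 1)]
    return [
        cmath.exp(2j * math.pi * freq * t) * math.exp(-(t ** 2) / (2 * s ** 2))
        for t in times
    ]

def power_at(signal, wavelet, center):
    """Power real(Z)^2 + imag(Z)^2 of the convolution result Z at sample `center`.
    Requires at least len(wavelet) // 2 samples on either side of `center`."""
    half = len(wavelet) // 2
    z = sum(signal[center - j] * wavelet[j + half] for j in range(-half, half + 1))
    return z.real ** 2 + z.imag ** 2

def to_db(power, baseline):
    """Decibel normalization: 10 * log10(power / baseline)."""
    return 10 * math.log10(power / baseline)
```

For a pure 10 Hz sine, `power_at` with a 10 Hz wavelet should be far larger than with a 30 Hz wavelet, which is the frequency selectivity the decomposition relies on.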
EEG multivariate pattern analysis.
In addition to univariate time-frequency analysis on each single electrode, we applied a backward decoding classification algorithm (linear discriminant analysis with all 64 channels as features and “switch” and “repeat” as classes) on time-frequency decomposed power. The goal of this analysis was to test whether a classifier could learn from spatial patterns of power modulations in specific time-frequency intervals, whether a participant at a single trial was going to repeat target selection or switch to a different target, and whether this would differ between the Free and the Imposed selection condition. Moreover, given that we used prestimulus activity, we could also test whether classifiers could predict such a choice before it happened.
For this analysis we used ADAM (the Amsterdam Decoding and Modeling toolbox http://www.fahrenfort.com/ADAM.htm), a freely available MATLAB toolbox for backward decoding and forward encoding modeling of EEG and MEG data (Fahrenfort et al., 2018), replacing the standard time-frequency decomposition algorithm in that toolbox with our custom written time-frequency decomposition. Training and testing were done on the same data, for the Free and Imposed selection condition separately, using a 10-fold cross-validation procedure: first, trials for each of the two conditions were randomized in order, and divided into 10 equal-sized folds; next, a leave-one-out procedure was used on the 10 folds, such that the classifier was trained on 9 folds and tested on the remaining fold, and each fold was used once for testing. Classifier performance was then averaged over folds. Because there were more repeat than switch trials, we balanced the two classes through oversampling, to ensure that during training the classifier would not develop a bias for the overrepresented class (Fahrenfort et al., 2018). The classification performance output metric was the area under the curve (AUC), with the curve being the receiver operating curve of the cumulative probabilities that the classifier assigns to instances as coming from the same class (true-positives) against the cumulative probabilities that the classifier assigns to instances that come from the other class (false-positives). The AUC takes into account the degree of confidence (distance from the decision boundary) that the classifier has about class membership of individual instances, rather than averaging across binary decisions about class membership of individual instances (as happens when computing standard accuracy). As such the AUC is considered a sensitive, nonparametric and criterion-free measure of classification (Hand and Till, 2001).
We also inspected the spatial distribution of the classifier weights, through the product of the classifier weights and the original data covariance matrix at each time-frequency point. This procedure results in activation patterns, which are equivalent to the topographical maps of univariate difference between classes (Haufe et al., 2014), although now numerically resulting from a decoding analysis. These maps were further spatially normalized by subtracting the average of the electrodes in the map, and dividing by the SD across electrodes in the map. This yielded a z-score per electrode, reflecting the deviation from average electrode activity across the map expressed in SD. This normalized map was computed per individual subject, after which the z values were averaged across subjects.
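The cross-validation logic can be sketched in Python as follows. This is not ADAM's implementation. Two parts are assumptions and are labeled as such in the code: the classifier is a simple mean-difference linear scorer standing in for linear discriminant analysis, and oversampling is done by random duplication of minority-class trials, which may differ from the toolbox's method.

```python
import random

def auc(pos, neg):
    """Probability that a random class-1 score exceeds a random class-0 score
    (ties count 0.5); equals the area under the ROC curve."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def make_folds(n_trials, n_folds, rng):
    """Shuffle trial indices and split them into n_folds folds."""
    idx = list(range(n_trials))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]

def oversample(X, y, rng):
    """Assumption: balance classes by randomly duplicating minority-class trials.
    Both classes must be present in the input."""
    Xb, yb = [], []
    rows = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    n = max(len(r) for r in rows.values())
    for c in (0, 1):
        extra = [rng.choice(rows[c]) for _ in range(n - len(rows[c]))]
        Xb.extend(rows[c] + extra)
        yb.extend([c] * n)
    return Xb, yb

def fit_mean_difference(X, y):
    """Stand-in linear scorer (NOT LDA): weights = class-1 mean minus class-0 mean."""
    def class_mean(c):
        rows = [x for x, lab in zip(X, y) if lab == c]
        return [sum(r[j] for r in rows) / len(rows) for j in range(len(X[0]))]
    return [m1 - m0 for m0, m1 in zip(class_mean(0), class_mean(1))]

def cross_validated_auc(X, y, n_folds=10, seed=0):
    """Train on all folds but one, test on the held-out fold, average AUC over folds.
    y uses 1 for switch and 0 for repeat trials."""
    rng = random.Random(seed)
    aucs = []
    for test_idx in make_folds(len(X), n_folds, rng):
        held_out = set(test_idx)
        train = [i for i in range(len(X)) if i not in held_out]
        Xtr, ytr = oversample([X[i] for i in train], [y[i] for i in train], rng)
        w = fit_mean_difference(Xtr, ytr)
        scores = {i: sum(wi * xi for wi, xi in zip(w, X[i])) for i in test_idx}
        pos = [scores[i] for i in test_idx if y[i] == 1]
        neg = [scores[i] for i in test_idx if y[i] == 0]
        if pos and neg:  # AUC is undefined if a test fold lacks one class
            aucs.append(auc(pos, neg))
    return sum(aucs) / len(aucs)
```

Oversampling is applied to the training folds only, so the held-out trials used for the AUC are never duplicated into the training set.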
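The weight-to-pattern transformation and the spatial normalization can be sketched in Python. The function names are illustrative, and the use of the sample SD (n − 1 denominator) for the z-score is an assumption, since the text does not specify it.

```python
import statistics

def covariance(data):
    """Sample covariance matrix (channels x channels) of trials x channels data."""
    n, n_ch = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(n_ch)]
    return [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in data) / (n - 1)
            for j in range(n_ch)
        ]
        for i in range(n_ch)
    ]

def activation_pattern(data, weights):
    """Activation pattern: data covariance matrix times classifier weights
    (Haufe et al., 2014)."""
    return [sum(c * w for c, w in zip(row, weights)) for row in covariance(data)]

def zscore_map(pattern):
    """Subtract the mean across electrodes and divide by the SD across electrodes.
    Assumption: sample SD; a population SD would only rescale the map."""
    mean = statistics.mean(pattern)
    sd = statistics.stdev(pattern)
    return [(v - mean) / sd for v in pattern]
```

Because the z-scoring is done within each map, the resulting values express where on the scalp a pattern is strongest, not how strong it is in absolute terms.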
Statistical testing.
Statistical analyses were done using group-level permutation testing with cluster correction (Maris and Oostenveld, 2007). For decibel-normalized power, this was done on a switch > repeat contrast, for the Free and Imposed selection conditions separately, and on the double contrast of Free (switch > repeat) > Imposed (switch > repeat). For the multivariate pattern analysis results, this was done on AUC values above chance (0.5) for Free and Imposed selection separately. In all permutation tests, group-level t values were first computed for the above contrasts, and for every time-frequency point. These t values were thresholded at p < 0.05, yielding clusters of significant time-frequency power modulations. The t values in each of these observed clusters were summed. Next, in 2000 iterations, the condition labels (e.g., power values of switch vs repeat, or the AUC value and its 0.5 reference value) were randomized for each subject, before performing t tests on these permuted data. The sum of t values within the largest cluster was saved into a distribution of summed cluster t values. This distribution reflected cluster-level effect-sizes under the null-hypothesis of no effect. Finally, observed clusters with summed t values smaller than the 95th percentile of the null-distribution (corresponding to p ≥ 0.05) were removed. This approach ensures a correction for multiple comparisons by taking into account clusters of spuriously significant time-frequency points that occur purely by chance. Time-frequency cluster tests were done on the average activity of electrodes FC1, FCz, and FC2, which we selected based on previous findings from our laboratory (van Driel et al., 2017). 
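The permutation procedure can be illustrated with a simplified Python sketch. It operates on a one-dimensional series of points, not a two-dimensional time-frequency map. It takes per-subject differences (e.g., switch minus repeat) as input, randomizes condition labels by flipping the sign of each subject's difference, and takes the critical t value as a parameter (e.g., 2.045 for 29 degrees of freedom, two-tailed). The function names are hypothetical.

```python
import math
import random

def t_values(diffs):
    """One-sample t value per point; diffs is a subjects x points list of differences."""
    n = len(diffs)
    out = []
    for j in range(len(diffs[0])):
        col = [row[j] for row in diffs]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out

def clusters(tvals, t_crit):
    """Runs of contiguous points with |t| > t_crit and the same sign,
    returned as (start, end_exclusive, summed_t)."""
    found, start = [], None
    for j, t in enumerate(tvals + [0.0]):  # trailing 0.0 closes a final run
        sig = abs(t) > t_crit
        if start is not None and (not sig or t * tvals[start] < 0):
            found.append((start, j, sum(tvals[start:j])))
            start = None
        if sig and start is None:
            start = j
    return found

def cluster_permutation_test(diffs, t_crit, n_perm=2000, seed=0):
    """Return (start, end_exclusive, summed_t, p) for each observed cluster, where p
    is the proportion of permutations whose largest cluster exceeds the observed one."""
    rng = random.Random(seed)
    observed = clusters(t_values(diffs), t_crit)
    null = []
    for _ in range(n_perm):
        # Swapping condition labels within a subject flips the sign of the difference.
        flipped = [[s * v for v in row]
                   for row in diffs for s in [rng.choice((-1.0, 1.0))]]
        perm = clusters(t_values(flipped), t_crit)
        null.append(max((abs(c[2]) for c in perm), default=0.0))
    return [
        (start, end, tsum, sum(1 for v in null if v > abs(tsum)) / n_perm)
        for start, end, tsum in observed
    ]
```

Clusters whose p value is 0.05 or larger would then be discarded, which corresponds to removing observed clusters below the 95th percentile of the null distribution.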
To visualize the spatial distribution of resulting time-frequency clusters, we next averaged over the activity within these clusters, and tested the same trial-type and condition contrasts over all channels, using cluster-correction across space instead of time-frequency, now with a (pre-)cluster-threshold of p < 0.01. To evaluate candidate clusters, we used Fieldtrip's neighbors structure for the 64-channel BioSemi layout, and we set 1 channel as the smallest possible cluster. All reported “cluster-corrected” p values in Results refer to the proportion of permuted clusters under the null hypothesis that were larger than the observed cluster.
Additionally, we ran parametric paired samples t tests and repeated-measures ANOVAs with factors Target Availability (Free, Imposed) and Trial type (switch, repeat) over highlighted time-frequency windows, using JASP v0.9 (https://jasp-stats.org/). We further tested for cross-subject correlations between brain and behavior measures using the robust percentage-bend correlation metric that de-weights outliers (Pernet et al., 2012).
Results
A planned number of 30 participants performed a gaze-contingent memory-guided visual search task (Fig. 1A). At the start of every sequence of trials, observers were given a cue as to which two target colors to look for. A sequence consisted of 40 consecutive search displays each containing a heterogeneous set of colors, among which either one or two were target colors (depending on condition). Participants were instructed to make an eye movement toward one of the two target colors while avoiding distractors. A new display would then emerge with the current fixation as the starting point. Crucially, in the Free selection condition, both target colors were always present in each search display, thus allowing observers proactive control over which target to prioritize from trial to trial. In the Imposed selection condition, each search display only contained one of the two target colors, and which target would appear was randomly determined (with the same distribution of switches as in the Free selection condition; see Materials and Methods). Thus, here prioritization for the wrong target would require reactive priority reconfiguration. However, if instead observers prioritize both targets equally, no changes in either proactive or reactive control are necessary from trial to trial.
Figure 1. Task design and behavioral results. A, A block began with a fixation cross, and a cue indicating the two target colors for the subsequent sequence of search displays. Depending on the condition, each search display contained either one target color (Imposed selection condition; hypothesized to require reactive control on a significant portion of trials) or both target colors (Free selection condition; allowing for efficient proactive control throughout the block). Participants were required to make an eye movement to, and fixate (one of) the (two) target(s), which then triggered the next display. B, Left, Saccadic latency in milliseconds as a function of condition (green, Free selection condition; red, Imposed selection condition) and trial type (open dots, repeat trials; filled dots, switch trials). Each dot shows the trial-average data of a single observer. Horizontal lines show the group average. Gray lines connecting the dots visualize the within-subject difference between repeat and switch trials. Colored asterisks show the within-condition comparison of repeat versus switch trials, thus illustrating switch costs in both conditions. Right, Switch costs in ms for Free selection (green triangles) and Imposed selection (red triangles) conditions. Gray lines and asterisks show the interaction effect of stronger switch costs in the Imposed selection than in the Free selection condition. **p < 0.01, ***p < 0.001.
Our results clearly indicate differentially controlled priority states. First, switch costs in saccadic latency were considerably larger in the Imposed than in the Free selection condition (F(1,29) = 14.66, p < 0.001; Fig. 1B). When a change in targets was task-imposed, observers were slower than when targets repeated from one trial to the next (by 61 ms on average; SD = 64; t(29) = 5.20, p < 0.001). In fact, there was also a reliable, though much smaller, switch cost when target selection was free (M = 16 ms; SD = 26; t(29) = 3.36, p = 0.002). These magnitude differences in switch costs are a direct replication of earlier findings (Ort et al., 2017, 2018), and are consistent with, though not conclusive for, a difference in the moment and type of control.
Second, the EEG data reveal clear differential state changes associated with freely initiated versus task-imposed switches. In the Free selection condition, any neural signature reflecting a preparatory mechanism should be apparent before display onset, in the time between the offset of the previous and the onset of the next search display. We moreover hypothesized a potential proactive control mechanism to show a topographical distribution over midfrontal scalp regions, consistent with neuroimaging studies on voluntary and self-initiated behavior (Schuck et al., 2015; Wisniewski et al., 2015). Local oscillatory dynamics have been linked to memory content, motor intentions, and different modes of control (Donner and Siegel, 2011; Helfrich and Knight, 2016). Therefore, we decomposed the EEG data into its time-frequency representation, and compared switch-related activity in three frontocentral electrodes (FC1, FCz and FC2) for Free versus Imposed selection. The Free selection condition showed a robust reduction of power in the beta band (15–35 Hz) for switch relative to repeat trials (cluster-corrected, p < 0.001; Fig. 2A), starting ∼700 ms before the upcoming search display. The effect comprised one sustained time-frequency cluster, reducing in bandwidth around the moment of the saccade, after which the same broadband beta suppression effect re-emerged post-saccade. Importantly, this effect was not present in the Imposed selection condition (p > 0.90). This difference was also apparent from the Free versus Imposed selection contrast, which showed significant switch-related beta-suppression in a time window ∼500 ms before the anticipated onset of the next search display (p = 0.002), followed by poststimulus (and peri-saccade) suppression in the upper-alpha/lower-beta range (10–18 Hz; ∼200–900 ms; p = 0.002). This shows that in a context of free choice, proactively deciding to switch to a different target is supported by relatively suppressed midfrontal beta power.
Figure 2. Local time-frequency power results. A, Time-frequency maps showing t values of statistical tests of switch>repeat on decibel-normalized power, over three midfrontal channels (FC1, FCz, FC2; highlighted in purple in the topographical maps in B), for the Free and Imposed selection conditions, and their contrast. Saturated colors show clusters of contiguously significant (p < 0.05) time-frequency points after correcting for cluster sizes under the null hypothesis of no effect (cluster-threshold: p < 0.05). B, Topographical map of t values, averaged over the significant time-frequency clusters shown in A. Purple disks show the three midfrontal channels of the time-frequency map; yellow disks show the extension of the effects shown after testing over all channels and using a cluster-size threshold of p < 0.01. The leftmost map shows the entire beta cluster in the Free selection condition, the map next to it shows the poststimulus theta cluster in the Imposed selection condition, whereas the rightmost two topographical maps show the separate prestimulus (pre-stim) and the poststimulus (post-stim) beta clusters in the Free > Imposed contrast. C, Single-subject power (dB) for switch > repeat, averaged within the beta band (left) and theta band (right) time-frequency clusters shown in A. Green dots, Free selection condition; red dots, Imposed selection condition. Horizontal lines show the group-average. Gray lines connecting the dots visualize the within-subject difference between Free and Imposed selection. **p < 0.01, ***p < 0.001. D, Scatter plots showing, across subjects, the spectral power of switch versus repeat trials plotted against behavioral switch costs (saccadic-latency difference of switch vs repeat trials), per condition (left and center), and for the condition difference of Free versus Imposed (right). **Significant robust percentage-bend correlation (Pernet et al., 2012), p < 0.01.
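The cluster-size correction mentioned in the legend can be illustrated with a deliberately simplified Python sketch. It assumes a single dimension (time points only), within-subject sign-flipping to build the null distribution, and an exhaustive set of permutations; the actual analysis operated on time-frequency maps and channels, and its exact settings are described in Materials and Methods. The data and function names below are hypothetical.

```python
import itertools
import math
import statistics


def t_values(diffs):
    """One-sample t value per time point; `diffs` is subjects x time points."""
    n = len(diffs)
    return [statistics.mean(col) / (statistics.stdev(col) / math.sqrt(n))
            for col in zip(*diffs)]


def max_cluster_size(tvals, threshold):
    """Longest run of contiguous same-sign points whose |t| exceeds threshold."""
    best = 0
    for sign in (1, -1):
        run = 0
        for t in tvals:
            run = run + 1 if sign * t > threshold else 0
            best = max(best, run)
    return best


def cluster_p_value(diffs, threshold):
    """Compare the observed maximum cluster size with a sign-flip null."""
    observed = max_cluster_size(t_values(diffs), threshold)
    null = []
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        flipped = [[s * v for v in row] for s, row in zip(signs, diffs)]
        null.append(max_cluster_size(t_values(flipped), threshold))
    return observed, sum(size >= observed for size in null) / len(null)


# Made-up switch-minus-repeat differences: 6 observers x 8 time points,
# with a consistent suppression at time points 2 to 5 and noise elsewhere.
noise_a = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1]
noise_b = [-0.1, 0.3, -0.2, 0.2, -0.3, 0.1]
noise_c = [0.2, -0.1, -0.3, 0.1, 0.3, -0.2]
noise_d = [-0.3, 0.1, 0.2, -0.2, -0.1, 0.3]
effect = [-1.0, -1.2, -0.8, -1.1, -0.9, -1.0]
diffs = [[noise_a[i], noise_b[i]] + [effect[i]] * 4 + [noise_c[i], noise_d[i]]
         for i in range(6)]

# 2.571 is the two-tailed p < 0.05 critical t value for df = 5.
observed_size, p_cluster = cluster_p_value(diffs, threshold=2.571)
```

In this toy example only the unflipped data and their mirror image produce a cluster of four points, so the cluster-level p value is 2/64. With 30 observers an exhaustive enumeration is not feasible, and a random subset of sign flips would be drawn instead.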
To further test the putative involvement of frontal control regions, we evaluated the topographical specificity of these effects. We averaged the activity within the beta-band time-frequency windows, and tested for clusters of channels that would show differences between switch and repeat trials and between conditions. This revealed that the beta-band modulation in the Free selection condition covered prefrontal as well as posterior parietal scalp regions, with a right-hemisphere dominance (p < 0.001; Fig. 2B), whereas there was again no effect in the Imposed selection condition. The prestimulus difference between conditions was localized to a more confined midfrontal-premotor region (p < 0.001). Thus we link the endogenous switching between target representations to modulations of frontoparietal beta oscillations.
In contrast, switches in the Imposed selection condition elicited robust low-frequency (2–8 Hz) delta-to-theta band power enhancements starting ∼250 ms post-display onset, over the three midfrontal channels (Fig. 2A; p = 0.002), which topographically extended toward anterior and lateral prefrontal scalp regions (p < 0.001; Fig. 2B). Similar medial and lateral prefrontal theta-band modulations have been linked to a myriad of conflict- and error-related performance monitoring mechanisms (for review, see Cavanagh and Frank, 2014; Cohen, 2014a). When we tested within the theta-band cluster only, both conditions showed switch-related theta-power enhancement (Free: t(29) = 3.49, p = 0.002; Imposed: t(29) = 5.54, p < 0.001; Fig. 2C), although this effect was reliably stronger in the Imposed selection than in the Free selection condition (F(1,29) = 6.20, p = 0.019), and, in the latter case, did not survive cluster correction when such correction was applied to the entire time-frequency domain (Fig. 2A, left). Apparently, some selection conflict may have occurred even in the Free selection condition. Finally, if the observed conflict signal was instrumental in bringing about slower saccades toward changed targets, this should be reflected in a correlation between theta power and switch costs. Indeed, stronger theta power for switch compared with repeat trials predicted higher switch costs across observers in the Imposed selection condition (robust percentage-bend correlation: r = 0.53, p = 0.002; Fig. 2D), but not in the Free selection condition (r = 0.17, p = 0.37). Moreover, the condition differences in switch-related frontal theta correlated positively with condition differences in saccadic switch costs (r = 0.47, p = 0.009). Beta suppression did not correlate with behavior in either condition, whether before or after stimulus onset (all r values <0.15, all p values >0.50).
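For readers unfamiliar with the percentage-bend correlation, the Python sketch below illustrates the idea: each variable is standardized around a robust location estimate and clipped to the range [-1, 1], so that extreme observers cannot dominate the coefficient. It follows the standard formulation with a bend constant of 0.2 as we understand it; the toolbox of Pernet et al. (2012) used for the reported values may differ in detail, and the data and function names here are made up.

```python
import math
import statistics


def bend_scores(values, beta=0.2):
    """Standardize around a percentage-bend location and clip to [-1, 1]."""
    n = len(values)
    med = statistics.median(values)
    abs_dev = sorted(abs(v - med) for v in values)
    omega = abs_dev[math.floor((1 - beta) * n) - 1]  # robust scale estimate
    psi = [(v - med) / omega for v in values]
    n_low = sum(1 for p in psi if p < -1)
    n_high = sum(1 for p in psi if p > 1)
    kept_sum = sum(v for v, p in zip(values, psi) if -1 <= p <= 1)
    location = (kept_sum + omega * (n_high - n_low)) / (n - n_low - n_high)
    return [max(-1.0, min(1.0, (v - location) / omega)) for v in values]


def percentage_bend_correlation(x, y, beta=0.2):
    a, b = bend_scores(x, beta), bend_scores(y, beta)
    num = sum(ai * bi for ai, bi in zip(a, b))
    return num / math.sqrt(sum(ai * ai for ai in a) * sum(bi * bi for bi in b))


def pearson(x, y):
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: a perfect linear relation, then the same with one extreme value.
x = [float(v) for v in range(1, 11)]
y_clean = [2.0 * v + 1.0 for v in x]
y_outlier = y_clean[:-1] + [1000.0]

r_clean = percentage_bend_correlation(x, y_clean)
r_robust = percentage_bend_correlation(x, y_outlier)
r_pearson = pearson(x, y_outlier)
```

In this example the percentage-bend coefficient stays at 1 after the outlier is introduced, whereas Pearson's r drops to roughly 0.5.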
Together, the above results uncover a clear qualitative dissociation between proactive, preparatory switching reflected in midfrontal and sensorimotor beta-band suppression, and reactive, conflict-related switching reflected in prefrontal theta-band enhancements. However, these results were obtained by preselecting EEG channels, and after standard trial-averaging methodology. To check whether these selections exhaustively captured all relevant mechanisms, we tested whether we could predict at the single-trial level whether an observer would switch or stay, based on the multivariate power distributions across the scalp for all time and frequency combinations. Linear-discriminant classifiers were trained on single-trial topographic distributions across the entire scalp, in dissociating switch from repeat “classes” across time and frequency (Fahrenfort et al., 2018). The performance of these classifiers was then tested on the same time-frequency points (through a cross-validation procedure; see Materials and Methods). In a time window starting 500 ms before the anticipated search displays in the Free selection condition, a cluster of activity comprising the alpha-to-beta band (10–30 Hz) indeed predicted whether in the upcoming trial, the saccade was going to be directed toward a different (switch) or same (repeat) target as in the previous trial (p < 0.001; Fig. 3A). This effect reappeared after the saccade (∼500 ms poststimulus), in a slightly lower-frequency range (6–23 Hz; p < 0.001). Classification accuracy in the Imposed selection condition showed a similar broadband poststimulus increase (p < 0.001). However, and crucially, when target selection was task-imposed there was no prestimulus beta activity that was predictive of target switches; instead, the significant poststimulus classification cluster contained relatively stronger modulations in the lower-frequency range of delta-theta (2–8 Hz).
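The logic of cross-validated classification scored with the area under the ROC curve (AUC) can be sketched as follows in Python. This is a simplified stand-in rather than the pipeline used here: the classifier projects each trial onto the difference between class means, which equals a linear discriminant only under the assumption of an identity channel covariance, and the simulated three-channel data, fold scheme, and function names are all hypothetical.

```python
import random


def auc(scores, labels):
    """Probability that a switch trial scores higher than a repeat trial."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def class_mean(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def fit_mean_difference(train_x, train_y):
    """Weights = mean(switch) - mean(repeat); LDA with identity covariance."""
    mu_switch = class_mean([x for x, y in zip(train_x, train_y) if y == 1])
    mu_repeat = class_mean([x for x, y in zip(train_x, train_y) if y == 0])
    return [a - b for a, b in zip(mu_switch, mu_repeat)]


def cross_validated_auc(x, y, n_folds=5):
    """Train on all folds but one, score the held-out fold, average the AUCs."""
    fold_aucs = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(x)) if i % n_folds == fold]
        train_idx = [i for i in range(len(x)) if i % n_folds != fold]
        w = fit_mean_difference([x[i] for i in train_idx],
                                [y[i] for i in train_idx])
        scores = [sum(wi * xi for wi, xi in zip(w, x[i])) for i in test_idx]
        fold_aucs.append(auc(scores, [y[i] for i in test_idx]))
    return sum(fold_aucs) / n_folds


# Simulated single-trial power on three channels; switch trials (label 1)
# carry lower power on the first channel, mimicking beta suppression.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
trials = []
for label in labels:
    channels = [rng.gauss(0.0, 1.0) for _ in range(3)]
    if label == 1:
        channels[0] -= 2.0
    trials.append(channels)

mean_auc = cross_validated_auc(trials, labels)
```

An AUC of 0.5 corresponds to chance-level classification. In the analysis reported here, this procedure is repeated at every time-frequency point, which yields the maps shown in Fig. 3A.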
We directly compared conditions in these two time windows and frequency bands, and found that, as expected, the Free selection condition showed stronger beta decoding before the search display (t(29) = 5.61, p < 0.001; Fig. 3B), whereas the Imposed selection condition showed stronger theta decoding during search (t(29) = 2.76, p < 0.010; frequency by condition interaction: F(1,29) = 30.55, p < 0.001). The forward-transformed topographies of the classifier weights served as a further validation of our initial channel selection. Such activation patterns are equivalent to the univariate difference between conditions (Haufe et al., 2014; see Materials and Methods). These maps confirmed that prestimulus decoding in beta (10–30 Hz) was reflected in a suppression over midfrontal-premotor channels during Free selection (p = 0.002; Fig. 3C), whereas poststimulus decoding in theta (2–8 Hz) was reflected in a frontoparietal enhancement under Imposed selection (p < 0.001).
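Why classifier weights need to be forward-transformed before they are interpreted topographically can be shown with a small Python sketch of the transformation described by Haufe et al. (2014). For a single linear classifier with weights w, the activation pattern is the channel covariance multiplied by w, scaled by the variance of the classifier output. The two-channel data below are a constructed toy example, and the function names are hypothetical.

```python
def covariance(rows):
    """Sample covariance matrix of `rows` (trials x channels)."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered = [[v - m for v, m in zip(row, means)] for row in rows]
    k = len(means)
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(k)]
            for i in range(k)]


def forward_pattern(rows, weights):
    """Activation pattern: cov(X) w divided by the variance of the output w'X."""
    cov = covariance(rows)
    raw = [sum(c * w for c, w in zip(cov_row, weights)) for cov_row in cov]
    output_var = sum(w * r for w, r in zip(weights, raw))
    return [r / output_var for r in raw]


# Toy data: channel 0 records signal plus noise, channel 1 records noise only.
signal = [1.0, -1.0, 1.0, -1.0]
noise = [1.0, 1.0, -1.0, -1.0]
trials = [[s + n, n] for s, n in zip(signal, noise)]

# The filter that isolates the signal subtracts channel 1 from channel 0.
weights = [1.0, -1.0]
pattern = forward_pattern(trials, weights)  # approximately [1.0, 0.0]
```

Although the second channel receives a nonzero weight (it is needed to cancel the shared noise), its activation pattern is zero because it contains no signal. This is why the topographies in Fig. 3C are based on forward-transformed weights rather than on the raw weights.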
Figure 3. Multivariate classification findings. A, Classification accuracy (as measured through the AUC) after training a classifier on scalp-patterns of power at each time-frequency point in the Free (left, green) and Imposed (right, red) selection condition, using single-trial labels of switch and repeat, and testing the classifier weights on the same time-frequency points through cross-validation. Saturated colors show significant clusters of AUC after cluster-size thresholding at p < 0.05. Dashed black/gray rectangles demarcate the time-frequency windows used to display weight topographies in C. B, Classifier accuracy per individual observer, showing direct comparisons between the Free and Imposed selection conditions. AUC values were averaged over parts of the significant time-frequency clusters shown in A, comprising the pre-stimulus (pre-stim) alpha-to-beta band (10–30 Hz), and the poststimulus (post-stim) delta-to-theta band (2–8 Hz). Each dot shows the accuracy of one individual observer. Horizontal lines show the group-average. Gray lines connecting the dots visualize the within-subject difference between Free selection and Imposed selection. **p < 0.01, ***p < 0.001. C, Forward-transformed activation patterns (Haufe et al., 2014) of AUC within the time-frequency windows demarcated in A. Black-bordered white disks show clusters of significant neighboring electrodes after cluster-size thresholding at p < 0.01. Note, z values express the mean of the individual z-scores; that is, the mean deviation in activity per electrode, as first computed relative to the average electrode activity for each individual subject (see Materials and Methods). D, Time-frequency maps of cross-condition decoding, after training on the Free selection condition and testing on the Imposed selection condition (left) and vice versa (center). Saturated colors highlight a common alpha-beta cluster late in the trial (∼500–900 ms; p < 0.05) that generalized across the two conditions. Condition-average forward-transformed weights (right) of this common decoding effect showed a parieto-occipital scalp distribution (p < 0.01).
Post-saccade, the two conditions showed a comparable classification response in the alpha-beta range (Fig. 3A). Could this reflect a similar mechanism after a switch, whether freely chosen or task-imposed, has been initiated? Using cross-condition decoding (King and Dehaene, 2014), we tested whether the classifier weights trained on the data of one condition could predict above chance whether trials from the other condition were switches or repeats. This analysis showed that the Free and Imposed selection conditions indeed cross-generalized to a common post-saccadic cluster in the alpha/lower-beta band, relatively late into the trial (8–20 Hz, 500–900 ms; p < 0.001; Fig. 3D). This signal may reflect the top-down implemented changes in prioritization, or “reconfigured” working memory content, in both conditions. Indeed, the condition-averaged forward-transformed weights in this time-frequency window showed a parieto-occipital suppression (p < 0.001), consistent with switches in priority states (de Vries et al., 2017, 2018).
Discussion
Our study provides important new insights into the neural mechanisms underlying control of target selection in multiple-target search. First, selecting an alternative target comes with specific neural state changes. Such state changes are not predicted by theories that claim that multiple-target search involves the equal, parallel prioritization of target representations, because a system that is prepared for both targets does not need to change its state. Instead, the data provide strong support for trial-by-trial priority shifts that can be traced through distinct neural signals.
Second, we provide the first neural evidence for dissociable control mechanisms over target selection. These mechanisms depend on whether the environmental context allows for free choice over which target to select, or imposes such targets. We uniquely identified suppression of midfrontal/premotor beta-band (15–35 Hz) oscillatory activity as the signal that precedes free target switches with reduced behavioral cost. Importantly, time-frequency classifiers trained on single instances of beta-band scalp patterns could accurately predict the occurrence of these freely initiated switches more than half a second before they actually happened. We note that although beta suppression was predictive of the occurrence of a switch, it did not predict the magnitude of switch costs in saccadic latencies. This may be the case because these switch costs were very small to begin with and showed relatively little variation, consistent with the idea that observers had sufficient time to prepare for a new target. Together, we therefore interpret beta suppression as the electrophysiological signature of proactive control. Consistent with this, when observers were forced to switch, no prestimulus beta suppression occurred. Instead, here a switch was followed by a transient burst of prefrontal delta/theta band (2–8 Hz) oscillatory power, a signal that has been associated with reactive control, and which was positively correlated with individual differences in behavioral switch costs. We therefore interpret this signal as the electrophysiological signature of reactive control when observers encounter an unanticipated target.
In task switching, proactive control has been linked to various event-related potential components in EEG, as well as fMRI BOLD modulations within a frontoparietal network (Sohn et al., 2000; Rushworth et al., 2002; Braver et al., 2003; Astle et al., 2009; Karayanidis et al., 2010). However, these studies used explicit cues instructing observers to switch stimulus-response mappings. Our findings provide a novel contribution to this field, as we found anticipatory beta suppression to precede an endogenous switch under free choice. This effect was maximal over a region of the scalp covering midfrontal, precentral, as well as posterior parietal regions, with a slightly right-lateralized topographical distribution. Such a sensorimotor network has long been known to exhibit beta-band rhythmogenesis (Pfurtscheller et al., 1996; Baker, 2007).
Moreover, our results are in line with several findings linking beta-band dynamics to various internal decision making processes (Spitzer and Haegens, 2017). For instance, during somatosensory discrimination, beta power in sensorimotor and premotor regions can show categorical responses that reflect accurate perceptual choices (Haegens et al., 2011), which build up over time as sensory evidence accumulates (Donner et al., 2009; Siegel et al., 2011). Integrating these findings, it has been proposed that one underlying role of beta oscillations in these processes is to preserve the current sensorimotor or cognitive state (Engel and Fries, 2010). This can range from the maintenance of working memory content (Spitzer and Haegens, 2017), to the stabilization of a bistable percept in the absence of sensory change (Kloosterman et al., 2015). During such maintenance, beta oscillatory activity (from local modulations to long-range network synchronization; Donner and Siegel, 2011) is often observed to increase in strength. Conversely, beta activity has been proposed to be inversely related to the likelihood of an upcoming voluntary change-of-action, resulting from dopaminergic corticobasal ganglia interactions (Jenkinson and Brown, 2011). Our findings of suppressed beta activity directly fit such a proactive function, here implemented during visual search behavior while holding multiple target objects in working memory. At a more general level they are also consistent with studies on nonhuman primates showing beta band involvement in the top-down maintenance of a search target (Buschman and Miller, 2007, 2009), and the amount of free choice a monkey is allowed in choosing the order in which it fixates multiple targets (Pesaran et al., 2008). Finally, the midfrontal topography of the signal is consistent with fMRI studies implicating the medial frontal cortex in self-initiated behavior (Taylor et al., 2008; Schuck et al., 2015; Wisniewski et al., 2015).
Theta oscillations, on the other hand, have been proposed to constitute a key mechanism in medial and lateral prefrontal cortex that detects conflict and implements cognitive control (Cavanagh and Frank, 2014), particularly in reacting to an unanticipated mismatch between competing response alternatives of which only one is appropriate (Cohen, 2014a). Here, we witnessed a qualitatively similar prefrontal theta-band modulation triggered by what may analogously be an internal conflict between competing target representations, of which only one can guide selection. Alternatively, task-imposed switching between representations may involve novelty processing or the manipulation of working memory content, or may be considered a form of prediction error, all of which have also been shown to elicit changes in prefrontal theta-band dynamics (Barceló et al., 2006; Cavanagh et al., 2011; Nee et al., 2011; Itthipuripat et al., 2013; Ullsperger et al., 2014). The fact that we observed switch-related theta to extend into lower delta-band frequencies may link it to prediction-error detection rather than conflict processing per se (Cohen, 2014a). That is, performance errors often arise because of conflict and indeed elicit error-related prefrontal theta increases (van Driel et al., 2012), but uniquely involve delta-band activity as well (Cohen and van Gaal, 2014; Munneke et al., 2015).
Accurately applied cognitive control serves behavioral adjustments, which typically result in slower performance (Kerns et al., 2004; van Driel et al., 2015). Here, the switch-related theta increase indeed predicted the behavioral switch-costs across subjects when switches were enforced. Interestingly, endogenous free switches also showed, in addition to prestimulus beta suppression, a similar but weaker theta increase in the trial-averaged power analysis, which coincided with residual but uncorrelated behavioral switch-costs (Longman et al., 2013; Ort et al., 2018). It is likely that even though under free choice observers proactively prepare to switch, the very presence of both template-matching targets in the search display invokes some competition between them, and thus some form of conflict had to be resolved there too. However, multivariate classifier weights of poststimulus theta activity could not accurately dissociate switch from repeat trials during free choice, rendering it nonetheless a weak signature of reactive control in this condition.
Note that although the current data support the claim that changing target selection is accompanied by changes in the observer's priority state, they do not support the stronger claim that only a single target representation is active, and can thus be actively looked for, at a time (Olivers et al., 2011). The data are also consistent with multiple representations being active in parallel, but with a measurable difference in attentional weights (Found and Müller, 1996; de Vries et al., 2018) or attentional priming (Kristjánsson and Campana, 2010), leading to various degrees of priority of different target representations. What our results indicate is that this differential weighting is disruptive when observers have no choice over which target they will encounter, but can be overcome when the observer is given full proactive control.
To conclude, we provide the first direct support for two modes of control being operative in multiple-target search, which are expressed in widespread qualitative differences in the time-frequency landscape of the EEG signal. Moreover, these signal patterns are predictive of when a switch will occur (under proactive control conditions), or what the switch cost will be (under reactive control conditions). Our study not only bridges different findings within the field of visual search, but also connects concordant ideas in the fields of attention and cognitive control.
Footnotes
This work was supported by Consolidator Grant number ERC-2013-CoG-615423 of the European Research Council to C.N.L.O. We thank Grace Pulsford for assistance in data collection, and Olaf Dimigen for his advice on preprocessing EEG combined with eye tracking.
The authors declare no competing financial interests.
Correspondence should be addressed to Christian N.L. Olivers at c.n.l.olivers@vu.nl