Abstract
Cortical and subcortical networks have been identified that are commonly associated with attention and task engagement, along with theories regarding their functional interaction. However, a link between these systems has not yet been demonstrated in healthy humans, primarily because of data acquisition and analysis limitations. We recorded simultaneous EEG–fMRI while subjects performed auditory and visual oddball tasks and used these data to investigate the BOLD correlates of single-trial EEG variability at latencies spanning the trial. We focused on variability along task-relevant dimensions in the EEG for identical stimuli and then combined auditory and visual data at the subject level to spatially and temporally localize brain regions involved in endogenous attentional modulations. Specifically, we found that anterior cingulate cortex (ACC) correlates strongly with both early and late EEG components, whereas brainstem, right middle frontal gyrus (rMFG), and right orbitofrontal cortex (rOFC) correlate significantly only with late components. By orthogonalizing with respect to event-related activity, we found that variability in insula and temporoparietal junction is reflected in reaction time variability, rOFC and brainstem correlate with residual EEG variability, and ACC and rMFG are significantly correlated with both. To investigate interactions between these correlates of temporally specific EEG variability, we performed dynamic causal modeling (DCM) on the fMRI data. We found strong evidence for reciprocal effective connections between the brainstem and cortical regions. Our results support the adaptive gain theory of locus ceruleus–norepinephrine (LC–NE) function and the proposed functional relationship between the LC–NE system, right-hemisphere ventral attention network, and P300 EEG response.
Introduction
Neural and behavioral responses to external sensory input vary, even for identical stimuli and environmental conditions. Much of this variability is attributable to natural fluctuations in internal attentional state. Some of this internal variation is reflected in behavioral response time (RT), but residual variability remains that cannot be explained by externally observable events. Thus, purely event-related analyses that focus on mean responses to a set of identical stimuli are unable to fully explain the natural waxing and waning of attention.
Evoked responses to sensory stimuli have been studied extensively using EEG. Amplitudes of such event-related potentials (ERPs) modulate with a number of exogenous and endogenous factors, and different components are associated with particular tasks and stimulus types (Key et al., 2005). One well known ERP component, the P300 (Donchin and Coles, 1988; Picton, 1992; Makeig et al., 1999; Hopfinger and West, 2006), appears for both auditory and visual target stimuli, and its amplitude is known to modulate with attention to the task (Luck et al., 2000; Nieuwenhuis et al., 2005; Polich, 2007). Despite extensive study, the mechanism of P300 generation is still not completely understood, although invasive animal studies suggest that it is deeply rooted in the brainstem (BS), involving the locus ceruleus (LC), a tiny nucleus of cells that projects widely and has been found to play a role in target detection, focused attention, and behavioral responses to sensory stimuli (Aston-Jones and Cohen, 2005; Nieuwenhuis et al., 2005; Sara and Bouret, 2012). Based on results of EEG–fMRI studies using auditory or visual tasks (Eichele et al., 2005; Bénar et al., 2007; Walz et al., 2013), the P300 has also been shown to couple with activity in cortical regions, including anterior cingulate cortex (ACC), insula, and the right-lateralized frontal and temporoparietal regions of the ventral attention network (VAN). The common brain regions found using independent auditory and visual studies suggest that their modulatory roles are supramodal. However, to the best of our knowledge, such commonalities across sensory input modalities have not been investigated directly. A strong functional link has been proposed between the LC–norepinephrine (NE) system, the P300 response, and the VAN (Corbetta et al., 2008), but this hypothesis requires validation in healthy humans.
In this study, we use simultaneous EEG–fMRI to investigate non-invasively in healthy humans the BOLD correlates of natural fluctuations of attention that generalize across auditory and visual sensory domains. We use auditory and visual target-detection tasks and apply linear classifiers that discriminate target from standard stimuli. We use the classifier output to index attention to the task on a single-trial basis, correlating this variability with the BOLD data via the general linear model (GLM). We use the results as priors in a dynamic causal model analysis (Friston et al., 2003; Stephan et al., 2010). Our findings provide evidence for the role of the LC–NE system in supramodal attention-related modulation of evoked responses and support the hypothesized functional link between the LC–NE system, P300, and VAN.
Materials and Methods
The majority of these methods have been applied and described previously (Goldman et al., 2009; Sajda et al., 2010; Walz et al., 2013) and are reproduced here for the sake of completeness and ease of reading.
Behavioral paradigm.
Seventeen subjects (six females; mean of 27.7 years; range, 20–40 years) participated in three runs each of analogous visual and auditory oddball paradigms. The 375 (125 per run) total stimuli per task were presented for 200 ms each with a 2–3 s uniformly distributed variable intertrial interval and target probability 0.2. The first two stimuli of each run were constrained to be standards. For the visual task, the target and standard stimuli were, respectively, a large red circle and a small green circle on isoluminant gray backgrounds (3.45° and 1.15° visual angles). For the auditory task, the standard stimulus was a 390 Hz pure tone, which was selected to lie within a trough of the scanner sound frequency spectrum, and the target sound was broadband. This broadband “laser gun” sound was chosen such that EEG discriminator performance matched that of the visual task. Because our study focused on task-related attentional states, subjects were asked to respond to target stimuli, using a button press with the right index finger on an MR-compatible button response pad. Stimuli were presented to subjects using E-Prime software (Psychology Software Tools) and a VisuaStim Digital System (Resonance Technology), comprising headphones and 600 × 800 goggle display. All subjects gave informed consent following the protocol of the Columbia University Institutional Review Board.
Simultaneous EEG and fMRI data acquisition.
A 3 T Philips Achieva MRI scanner (Philips Medical Systems) was used to collect functional echo-planar image (EPI) data with 3 mm in-plane resolution and 4 mm slice thickness. We covered the entire brain by obtaining 32 slices of 64 × 64 voxels using a 2000 ms repetition time and 25 ms echo time. We also acquired a single-volume high-resolution (2 × 2 × 2 mm) EPI image and a 1 × 1 × 1 mm spoiled gradient recalled image for each subject for purposes of registration.
We simultaneously and continuously recorded EEG using a custom-built MR-compatible EEG system (Goldman et al., 2009; Sajda et al., 2010; Walz et al., 2013), with differential amplifier and bipolar EEG cap. The caps were configured with 36 Ag/AgCl electrodes including left and right mastoids, arranged as 43 bipolar pairs. Bipolar pair leads were twisted to minimize inductive pickup from the magnetic gradient pulses and subject head motion in the main magnetic field. This oversampling of electrodes ensured data from a complete set of electrodes even in instances when discarding noisy channels was necessary. To enable removal of gradient artifacts in our offline preprocessing, we synchronized the 1-kHz-sampled EEG with the scanner clock by sending a transistor–transistor logic pulse to a field-programmable gate array card (National Instruments) at the start of each of 170 functional image acquisitions. All electrode impedances were kept below 20 kΩ, including 10 kΩ resistors built into each electrode for subject safety. A comprehensive description of the hardware, along with the preprocessing and analysis methods described throughout the remainder of Materials and Methods, can be found in the book chapter by Sajda et al. (2010).
EEG data preprocessing.
We performed standard EEG preprocessing offline using MATLAB (MathWorks) with the following digital Butterworth filters: 1 Hz high pass to remove direct current drift, 60 and 120 Hz notches to remove electrical line noise and its first harmonic, and 100 Hz low pass to remove high-frequency artifacts not associated with neurophysiological processes. These filters were applied together in the form of a linear-phase finite impulse response filter to avoid distortions caused by phase delays.
Ballistocardiogram (BCG) artifacts are more challenging to remove, because they share frequency content with EEG activity. Existing BCG removal algorithms cause loss of signal power in the underlying EEG, so we performed single-trial classification (described in Materials and Methods, Single-trial analysis of EEG) on the data before BCG artifact removal. This is a justified practice because our classifier identifies discriminating components that are likely to be orthogonal to BCG. To compute scalp topographies of these discriminating components, BCG artifacts were removed from the continuous gradient-free data using a principal components analysis method (Goldman et al., 2009; Sajda et al., 2010). First, the data were low passed at 4 Hz to extract the signal within the frequency range in which BCG artifacts are observed and then the first two principal components were determined. The channel weightings corresponding to those components were projected onto the broadband data and subtracted out. These BCG-free data were then re-referenced from the 43 bipolar channels to the 34-electrode space to calculate scalp topographies of EEG discriminating components (described in Materials and Methods, Single-trial analysis of EEG).
We extracted 1000 ms stimulus-locked epochs, with baseline removal on the 200 ms before stimulus, from both the 43-channel datasets and the 34-electrode re-referenced dataset. By visual inspection, we discarded trials containing motion or blink artifacts, evidenced by sudden high-amplitude deflections, and also those with incorrect responses, identically for both datasets. This left ∼95% of the target trials remaining.
Single-trial analysis of EEG.
Independently for the auditory and visual data, we applied a linear classifier to the 43-channel EEG signal amplitude using the sliding window method of Parra et al. (2005). Specifically, we found a projection of the multidimensional EEG signal, xi(t), where i = {1… T} and T is the total number of trials, within a short time window that achieves maximal discrimination between target and standard trials. All time windows had a width of N = 50 ms, and the window center, τ, was shifted from 0 to 1000 ms relative to stimulus onset, in 25 ms increments. We used logistic regression to learn the 43-channel spatial weighting, w(τ), that maximally discriminated conditions, arriving at the projection, yi(τ), for each trial i and a given window τ.
Note that we use the average projection in each temporal window, τ, which tends to be less sensitive to noise (Parra et al., 2005). The result is a hyperplane, for each τ, that maximally discriminates the target versus standard trials given the EEG amplitudes for each trial (Fig. 1A).
Model design. The entire method was applied independently for multiple temporal windows, τ, of EEG data spanning the trial. Shown is an example for a given τ. A, We first estimate w, which is a linear weighting on the EEG sensors that maximally discriminates stimulus conditions: targets (red) vs standards (green). This determines a task-relevant projection of the data, in which the distance to the decision boundary reflects the decision certainty of the classifier. B, From w, we compute ȳi, which is the demeaned classifier output, ȳi = wT xi − ymean for each target trial, i. We treat ȳi as an index of attentional state on each target trial. Note that ȳi contains no discriminating information about target versus standard trials and instead represents the STV of the task-relevant EEG projection around the mean of all targets. C, Given the ȳi values and their corresponding stimulus-onset time points, we build regressors for an fMRI GLM, including event-related average BOLD response to targets (D), RT variability (E), and residual EEG variability (F). Similar regressors were included for the standard stimuli, along with motion parameters and temporal derivatives. All were convolved with the canonical HRF.
The classifier output, yi(τ), represents the confidence of the classifier in its prediction based on the training data and the model. We assessed classifier performance with the area under the receiver operating characteristic curve (Green and Swets, 1966), denoted AUC, using leave-one-out (LOO) cross-validation (Duda et al., 2000). AUC was calculated for multiple temporal windows, enabling observation of the temporal progression of task-relevant components and localization of the event-locked time with maximal discrimination between conditions. To obtain a significance threshold for the AUC values, we used a permutation test in which we randomly permuted the trial labels and ran the classifier using the LOO approach. We repeated this procedure 1000 times for each subject to generate a distribution of AUC values from which we computed the null hypothesis distribution for AUC values and the corresponding AUC threshold for p = 0.01. For each window τ, we also generated the forward model a(τ),
where yi(τ) is now organized as a vector y(τ), where each row is from trial i, and xi(t) is organized as a matrix, X(τ), where rows are channels and columns are trials, all for time window τ. These forward models can be viewed as scalp plots and interpreted as the coupling between the discriminating components and the observed EEG (Parra et al., 2005; Goldman et al., 2009; Sajda et al., 2010; Walz et al., 2013). Because we estimate a linear weighting for temporal windows spanning the trial, we are able to track the progression of the spatial components across time.
fMRI data preprocessing.
Using FSL (FMRIB Software Library; Smith et al., 2004), we performed bias-field correction on all images to adjust for field distortion artifacts caused by the EEG wires. We then performed slice-timing correction, motion correction, 0.01 Hz high-pass filtering, and 5 mm full-width half-maximum spatial smoothing on the functional data. Motion correction provided motion parameters, which were later included as confounds in the GLM. Functional and structural images were registered to a standard Montreal Neurological Institute (MNI) brain template after brain extraction, and each subject's image registration was checked manually to ensure proper alignment.
Traditional fMRI analysis.
We first ran a traditional fMRI analysis, using event-related and RT variability regressors in our GLM. The event-related regressors comprised boxcar functions with unit amplitude and onset and offset matching that of the stimuli. RT variability was modeled using unit amplitude boxcars with onset at stimulus time and offset at RT, and these were orthogonalized to the event-related regressors. Orthogonalization was implemented using FSL, which uses the Gram–Schmidt procedure (Strang, 2003) to decorrelate the RT regressor from all other event-related regressors. All regressors were convolved with the canonical hemodynamic response function (HRF), and temporal derivatives were included as confounds of no interest. An event-related targets versus standards contrast was also constructed. A fixed-effects model was used to model activations across runs, and a mixed-effects approach was used to compute the contrasts across subjects.
We combined individual runs of the experiment at the subject level in three ways: (1) including only auditory runs; (2) including only visual runs; and (3) including all six auditory and visual runs together. Each of these three analyses was carried through to the group level. Statistical image results for these traditional analyses were thresholded at z > 2.3, and clusters were multiple-comparison corrected at p < 0.05 (Worsley, 2002). Because it is possible for significant correlations to be driven much more strongly by a subset of the data (in this case by one task), we used the subject-level auditory-only and visual-only results to compute paired t tests and thus identify any brain regions with BOLD correlates that differed significantly across auditory and visual runs. We discarded any voxels that achieved an uncorrected voxelwise p < 0.01 from the statistical maps of the combined analysis to ensure that our reported findings are common to both auditory and visual tasks.
Because pulsation artifacts in the ventricles have been attributed to false activations near the BS (Astafiev et al., 2010), we similarly excluded any voxels that were correlated with the BCG pulsation artifact present in the EEG. These voxels were determined with a p < 0.01 uncorrected threshold, after incorporating the pulse timing into a GLM and carrying through to group level. The pulse timing was obtained using a peak detection algorithm on the EEG signals at temporal electrodes (T7, T8, CP5, CP6), which are the sites most strongly affected by such artifacts.
EEG-based fMRI analysis.
For the single-trial variability (STV) fMRI analysis, we modeled the variability of the neural response using an additional two regressors: one each for targets and standards. These EEG-based regressors were designed with 100 ms duration, centered on the classifier training window. The STV regressor height was modulated using the de-meaned output, ȳi = yi − ymean, of the EEG discriminator for each trial, i (Fig. 1). These regressors were convolved with the HRF and orthogonalized with respect to all traditional regressors, with temporal derivatives included as confounds. It was especially important to regress out the RT variability, because RT is known to be negatively correlated with attention and our aim was to study variability in task-related attention that cannot be detected using an external measure. This entire analysis was run independently for all EEG training windows exceeding a mean AUC value of 0.75 for both tasks, which is a common psychophysical threshold used in signal detection theory and highly conservative with regard to EEG discrimination. We focused on within-class variability, using only the task-relevant (target) stimuli STV statistical maps in our results interpretation.
We used the following randomization method to generate a null distribution and correct for multiple comparisons. We ran the full STV fMRI analysis described above after permuting the yτ = 450 ms values randomly within each class. One hundred such permutations were done for each subject, from which we generated 10,000 group-level statistical maps via random samplings from the subject-level results. Stelzer et al. (2013) demonstrated the validity of this oversampling method. This null distribution was used to generate familywise error (FWE)-corrected positive and negative activation p value maps within the BS (one map for each of the EEG windows).
Cluster-size thresholding is commonly used to correct for multiple comparisons in cortical regions, because true activations are assumed to be widespread and FWE correction is often too strict for such a large number of voxels. To avoid an arbitrarily selected cluster-size threshold in our cortical analysis, we applied threshold-free cluster enhancement (TFCE) to account for variable z-score magnitude within each cluster (Smith and Nichols, 2009). After also applying TFCE to all maps comprising the null distribution, we used FWE correction within a gray matter mask. All corrected maps were thresholded at p = 0.05.
As in the traditional analysis, we ran the EEG STV fMRI analysis separately three ways: (1) only auditory runs; (2) only visual runs; and (3) with auditory and visual combined at the subject level. The latter method was of primary interest, because it was used to investigate BOLD correlates that generalize across both sensory modalities. As described for the traditional fMRI analysis (see below, Traditional fMRI analysis), we used the auditory-only and visual-only results to determine voxels with activation that significantly differed between the two tasks. These were then excluded from the statistical maps of the combined analysis, as were any voxels that correlated with BCG pulse timing. No more than 5% of the initial voxels were eliminated. This ensured that our reported findings generalize across the auditory and visual domains.
Dynamic causal modeling of fMRI data.
The EEG-based fMRI analysis described above provided timing information about the BOLD correlates of endogenous attentional modulations, but it did not allow inference of directed relationships between these regional activations. We investigated interactions between our temporally specific EEG variability correlates by performing an effective connectivity fMRI analysis. In particular, we implemented single-state linear dynamic causal modeling (DCM; Friston et al., 2003; Stephan et al., 2010) using DCM10 in SPM8.
We used the results of our EEG-based fMRI analysis (see Fig. 6) to select five regions of interest (ROIs); these were chosen based on suprathreshold significance and particular interest in relation to previous studies of attentional networks: (1) BS; (2) ACC; (3) right ACC (rACC); (4) right middle frontal gyrus (rMFG); and (5) right orbitofrontal cortex (rOFC).
The activations were spread across four different EEG time windows (i.e., correlated with variability in temporally specific EEG discriminating components), and some differed in sign of correlation so their time series were not necessarily intrinsically correlated. Each ROI was a sphere with 4 mm radius, centered on the peak voxel of the group-level EEG-based GLM results. Time series were extracted from individual subjects' preprocessed functional data in MNI space (downsampled to 3 × 3 × 3 mm for computational efficiency) by estimation of the first principal component within each ROI. Auditory and visual runs of the experiment were combined because the focus of our study was task-related attentional modulations common across sensory input modalities.
We designed 12 related models (see Fig. 8) to investigate intrinsic directed connectivity among the five ROIs, focusing on cortical projections to and from the BS. In Models 1–9, we varied the direction of the ACC–BS connection (including one-way and bidirectional connections) and the stimulus input region (BS, ACC, or both). Directionality of the other connections was based on known structural projections in primates (Aston-Jones and Cohen, 2005), including a directed connection from the BS to rMFG and bidirectional connections between BS and rACC, as well as BS and rOFC. Models 10–12 used timing information from the EEG-based analysis in specification of all the connections, only allowing connections to be directed forward in time. Input region was varied similarly to the other models.
We used fixed-effects Bayesian model selection (BMS) to compare these 12 models both on a single-subject level and at the group level. BMS balances model fit and complexity, thereby selecting the most generalizable model. It estimates the relative model evidence and provides a distribution of posterior probabilities for all of the models considered. We used fixed effects because focused attention to a simple task involves basic physiological processes that are expected to be common to a healthy subject population (Stephan et al., 2009, 2010). We also compared families of similar models (Penny et al., 2010); the model space was divided into three families based on the direction of connectivity between the BS and ACC. Finally, we used Bayesian parameter averaging (BPA) to provide group-mean intrinsic connection strengths and their probabilities for the winning model.
Results
All subjects responded with high accuracy and speed. For the auditory task, 98.3 ± 2.0% of targets were correctly detected, with 404.1 ± 58.3 ms RT. The visual targets were detected with 98.4 ± 3.1% accuracy and 397.2 ± 38.9 ms RT.
EEG analyses
Traditional ERPs after target stimuli displayed strong widespread P300 responses for both the auditory and visual tasks. These were most prominent and discriminative relative to the ERP response to standard stimuli at the Pz electrode site (Fig. 2). N200 responses were also strongest and most discriminative at parietal sites for both tasks. The N100 and P200 components were largest at medial frontal electrodes for the auditory task. Both of these earlier ERP components could also be seen for the visual task but were stronger at parietal sites. The latencies and scalp topographies were consistent with previous ERP literature (Makeig et al., 1999; Key et al., 2005; Hopfinger and West, 2006), notably that the auditory N100–P200 complex presents over frontal regions, whereas the visual N100–P200 presents more strongly over parietal regions.
Traditional stimulus-locked ERPs. Target and standard EEG responses at the Fz (top) and Pz (bottom) electrode sites for both the auditory (left) and visual (right) oddball tasks. Time is in milliseconds relative to stimulus onset.
For both the auditory and visual tasks and for all subjects, we were able to discriminate target versus standard EEG trials with highly significant accuracy for multiple EEG windows (Fig. 3). Our permutation method (see Materials and Methods, Single-trial analysis of EEG) determined that the p = 0.01 significance corresponded to AUC = 0.66. The auditory stimulus discrimination resulted in a peak group-mean AUC = 0.84 at 225 ms, with a later discrimination peak of AUC = 0.81 at 375 ms. The visual stimulus discrimination curve displayed a broad peak with maximum AUC = 0.86 at 325 ms. Discriminator performance exceeded the more conservative AUC > 0.75 threshold for the 175–475 ms windows for the auditory task and 175–600 ms windows for the visual task. Thus, all windows from 175 to 475 ms were included in the fMRI analysis. Importantly, the high discrimination accuracy with similar temporal profiles showed the auditory and visual tasks to be well matched.
Single-trial EEG discrimination results. Group mean discriminator performance for the auditory (left) and visual (right) oddball tasks. Standard error is indicated with shading. Because we are interested in the BOLD correlates of single-trial EEG variability, we only consider EEG components with discrimination that is both significant (AUC > 0.66, p < 0.01) and substantial (AUC > 0.75, p ≪ 0.01). Corresponding EEG discriminant component scalp projections (forward models) are shown for a subset of the windows with significant classifier performance.
The grand mean forward models, which show the scalp distribution of discriminating components as they progress across time, are displayed below the discriminator performance curves in Figure 3 for a subset of windows with significant performance. Both auditory and visual EEG data were most discriminative at medial frontal sites very early in the trial and then at medial central sites slightly later, and finally the P300 range data were most discriminative at medial posterior sites. These topographies were consistent with previous reports for auditory (Goldman et al., 2009) and visual (Gerson et al., 2005) target discrimination. Discriminator output was significantly (p < 0.01) negatively correlated with RT for multiple EEG windows. This result demonstrated the need to orthogonalize our STV fMRI regressors to RT-variability regressors given our aim to study residual variance unobservable with behavioral response.
Traditional fMRI analysis
The traditional event-related target versus standard contrast resulted in extensive activations throughout multiple cortical and subcortical structures. Bilateral activations were observed in cerebellum, thalamus, insula, ACC, and supramarginal gyri. Left (i.e., contralateral) motor areas and rMFG were also strongly activated. These were consistent with previous oddball paradigm results (Stevens et al., 2000; Laurens et al., 2005). RT-variability statistical maps showed strong correlates in ACC, posterior cingulate cortex, left motor areas, rMFG, right lateral occipital cortex, right angular gyrus, and bilateral insula (Fig. 4).
fMRI BOLD correlates of RT variability. Group-level positive activations thresholded at z > 2.3 and cluster corrected at p < 0.05. All voxels correlating significantly (p < 0.01 uncorrected) with the BCG pulse timing are excluded. Also excluded are all voxels with significantly different activation strength for the auditory versus visual tasks (paired t test, p < 0.01 uncorrected).
Single-trial EEG variability fMRI analysis
The EEG-derived regressors resulted in significant group-level activations for multiple stimulus-locked EEG training windows, including positive correlates for the 200 ms window and negative correlates for the 425–475 ms windows. The early (200 ms) positive activations were observed in ACC and left caudate. EEG variability within all windows from 425 to 475 ms correlated negatively with BOLD activity across rMFG/right inferior frontal gyrus (rIFG). Negative correlates were also found in right precentral and postcentral gyri for the 425 and 475 ms windows. Right-lateralized negative activations were also detected in paracingulate/ACC at 425 ms and temporal and frontal poles, including rOFC at 475 ms. A negative 21-voxel activation in the BS was observed for the 450 ms window (Fig. 5); this cluster had a minimum corrected p value of 0.0018 at MNI coordinates (0, −28, −20). Table 1 contains a complete list of activations exceeding multiple-comparison-corrected p = 0.01 with their corresponding EEG windows. Figure 6 shows axial slices through the peak voxel of each of these clusters. Despite the high EEG discrimination accuracy in the middle window range, we did not detect significant activations at those latencies (225–400 ms) in this supramodal analysis, which used a paired t test to exclude areas that were more strongly correlated with EEG during one task (either auditory or visual).
BOLD fMRI correlate of EEG STV detected in the BS. EEG discriminating component variability in the 450 ms window correlated negatively with a 21-voxel cluster in the midbrain. Minimum FWE-corrected p value of 0.0018 was located at MNI coordinates (0, −28, −20).
BOLD correlates of EEG variability common to auditory and visual tasks
BOLD fMRI correlates of EEG STV spanning the trial. Timing diagram showing axial slice through peak voxel of each significant cluster. Time is in milliseconds relative to stimulus onset. Red–yellow denotes positive correlation with the single-trial EEG classifier output (attentional index), and blue–cyan denotes negative correlation. MNI z-coordinate is specified to lower right of each slice.
In addition to the stimulus-locked analysis, we investigated response-locked activity, because some cognitive processes and subcomponents of the P300 are more tightly time locked to behavioral responses than stimulus presentation (Makeig et al., 2004; Gerson et al., 2005; O'Connell et al., 2012). Although discrimination performance for response-locked EEG windows exceeded that of the stimulus-locked analysis, we were unable to detect significant BOLD correlates of response-locked discriminating activity common to both auditory and visual tasks.
Separating BOLD correlates of RT variability and latent EEG variability
We separated variance in the BOLD data explainable purely by RT variability from that explainable by single-trial EEG variability. Figure 7 summarizes these results. Specifically, we find strong bilateral insula correlates of RT variability but no significant correlates of the EEG variability in these regions. We also found that right angular gyrus/temporoparietal junction correlated with RT variability but not with single-trial EEG variability. Conversely, other regions that have been associated with P300 correlated significantly with EEG variability but not RT. This included right precentral and postcentral gyri and left caudate. Last, we find that the ACC and rMFG/rIFG regions correlate significantly with both RT variability and the residual EEG discriminating component variability, suggesting that some but not all of ACC and rMFG/rIFG BOLD response variation is observable in the RT variation.
Reflection of neural activity in RT is region specific. By regressing out RT variability from EEG variability in our fMRI model, we find that single-trial BOLD activity in a subset of attention- or P300-linked regions is reflected in the behavioral response. Listed are regions in which BOLD signal significantly (p < 0.05 corrected) correlates with RT STV, latent early-window EEG STV, the late-window EEG variability, or their intersections. Common associations within the literature are indicated. LOC, Lateral occipital cortex; PCC, posterior cingulate cortex; preCnt, precentral; postCnt, postcentral; TPJ, temporoparietal junction.
Dynamic causal model selection and parameter estimation
Across the 17 individual subjects and models examined (Fig. 8), Model 4 (Fig. 9) was the dynamic causal model chosen most frequently by the BMS method (five wins), with Model 5 in second place (three wins). Both of these models were members of the bidirectional model family, which was defined based on the presence of a bidirectional connection between the ACC and BS. All other connectivity was specified identically for these two models (with drives from BS to all other ROIs and connections directed toward the BS from ACC, rACC, and rOFC). The only difference was the input region for the stimulus events, which was the BS for Model 4 and the ACC for Model 5.
Dynamic causal models. We designed, estimated, and compared 12 related models. Five ROIs were selected from results of the EEG-based fMRI analysis and therefore were correlated with temporally specific EEG discriminating component variability. Red circles represent ROIs positively correlated with the EEG variability, and blue circles indicate negative. Diagram is arranged such that these EEG correlates are in temporal order from left to right. For cluster details, see Table 1 and Figure 6. Event timing for target (ut) and standard (us) stimuli were used as input to the dynamic causal models, applied to ACC, BS, or both. Model space was partitioned for family analysis based on ACC–BS connectivity. ACC to BS: Models 1, 2, 7, 10, 11, and 12; BS to ACC: Models 3, 6, and 8; bidirectional connectivity: Models 4, 5, and 9. The optimal model is noted with a gray box.
Group average DCM for the optimal model. Results of BPM for Model 4 (see Fig. 7), which won the BMS on the group level (posterior probability of 0.9997) and had more wins than any other model on the individual subject level. Intrinsic connection strengths are displayed next to their corresponding connectivity arrows; all values had posterior probability exceeding 0.9975. Time axis and circle color relate the model to the EEG-based fMRI results (see Fig. 6), from which these ROIs were derived.
On the group level, BMS chose Model 4 as an overwhelming winner, with relative posterior probability of 0.9997. Model 5 placed a distant second with relative probability of 0.0003. When BMS was applied to the partitioned model space, the bidirectional family won with a relative posterior probability of 1.0000 on the group level. (The other two partitions, which were defined by unidirectional ACC–BS connections, were each estimated to have relative probabilities <10−9.)
All group-mean coupling and input parameters estimated by BPA of Model 4 (Fig. 9) achieved a relative posterior probability >0.9975. Connectivity internal to each of the four cortical ROIs was negative (representative of self-inhibition), whereas BS self-connectivity had a low-magnitude positive coupling parameter estimate. Connectivity parameters from the BS directed to all ROIs were positive (excitatory), ranging from 0.56 to 1.04 Hz. Strongest coupling strength was estimated for the “earliest” ROI (i.e., derived from correlates of early EEG variability) and monotonically decreased to the lowest strength for the “latest” ROI. The ACC and rOFC connections directed toward the BS had negative (inhibitory) coupling parameters (−0.94 and −0.43 Hz, respectively), and the rACC showed a positive value (0.69 Hz).
Discussion
fMRI correlate of EEG variability in BS supports LC adaptive gain theory
The LC–NE system is commonly associated with optimization of task performance by engaging and refocusing attention (Aston-Jones and Cohen, 2005; Bouret and Sara, 2005; Dayan and Yu, 2006; Sara and Bouret, 2012). Current theories of its specific functional role rely heavily on animal studies, with confirmation in humans needed. BS imaging using fMRI poses a number of challenges, notably close proximity to pulsatile vessels, proper image alignment, and the small size of BS nuclei relative to BOLD spatial resolution. Nevertheless, fMRI activations in midbrain nuclei have been reported previously (Sterpenich et al., 2006; D'Ardenne et al., 2008; Eichele et al., 2008; Payzan-LeNestour et al., 2013), including EEG–fMRI evidence for an effective relationship between ACC and BS during a simple auditory target-detection task (Crottaz-Herbette and Menon, 2006). In the auditory oddball EEG–fMRI study of Eichele et al. (2008), a temporally independent component of the EEG had single-trial amplitude that covaried negatively with a cluster in the midbrain and positively with an ACC cluster; this particular component had central scalp topography, contained a strong P3a response, and modulated with arousal and target salience. Our BS correlate of the 450 ms EEG variability lies between and in close proximity to LC (Keren et al., 2009; Astafiev et al., 2010) and ventral tegmental area (VTA; Carter et al., 2009). Because the LC strongly innervates other nearby neuromodulatory midbrain nuclei and receives their projections to a lesser extent (Sara and Bouret, 2012), our cluster of activation likely arises from LC activity. The LC also projects to multiple subcortical structures and broadly across the cortex (Loughlin et al., 1986), but its only cortical inputs come from the ACC and OFC, which receive information from sensory cortices. These prefrontal regions are thought to regulate LC activity both by directly driving its phasic responses and via changes to baseline excitatory drive (Aston-Jones and Cohen, 2005).
Our methods revealed BOLD correlates of EEG variability in ACC for the 200 and 425 ms windows, consistent with multiple reports of ACC coupling with various components of the evoked EEG response. In previous EEG–fMRI studies, the ACC was shown to couple with the N100 component during auditory tasks (Mulert et al., 2008; Esposito et al., 2009), and fMRI-constrained ERP source modeling suggested that it is the major generator of the N2b and P3a components during both auditory and visual tasks (Crottaz-Herbette and Menon, 2006). Note that our early ACC cluster is not representative of sensory-evoked activity that differs between targets and standards; we do not investigate the discriminating projection of the EEG data itself but instead the BOLD correlates of STV along that projection. We also only interpret results within the target class to avoid stimulus-type confounds.
As part of the adaptive gain theory, modulatory drive from the ACC and OFC to the LC is proposed to result from outcome of decision processes (Aston-Jones and Cohen, 2005), an idea that stems from known latency of the LC and P300 responses and evidence to suggest that LC activity is more tightly locked to RT than stimulus presentation (Clayton et al., 2004). Therefore, we might expect to find significant correlates of response-locked EEG variability (which we did not detect here common to auditory and visual tasks). However, because variability in response-locked EEG components is highly correlated with RT variability (O'Connell et al., 2012), orthogonalization of the EEG-based regressors to RT regressors likely left too little residual variance for response-locked EEG correlates to achieve significance.
Our late correlates in rACC, BS, and rOFC all occur in the range of the P300 (respectively at 425, 450, and 475 ms) and are all coupled with modulations of discriminating components closely related to the P300 ERP component (see scalp distribution of discriminating components in Fig. 3). Their latency supports the decision outcome hypothesis of the adaptive gain theory and provides support for the role of ACC and OFC in direct modulation of the LC phasic response. Furthermore, they show that modulatory activity in ACC occurs before that in rOFC. The late ACC activation is also consistent with previous reports that link it to P300 modulations (Bledowski et al., 2004b; Linden, 2005; Crottaz-Herbette and Menon, 2006; Bénar et al., 2007); OFC is not commonly associated with ERP variability. Note, however, that our early ACC correlate suggests that it also has a role in tonic (i.e., baseline) modulation, which occurs before the decision. Our inability to detect correlates for middle EEG windows suggests that, in the middle time range, EEG and BOLD are coupled only in regions specific to the sensory input modality. Although we can only speculate about the lack of detected correlates for middle EEG windows, spatial and temporal variability across subjects is a likely contributing factor. Additionally, in the middle time range, EEG and BOLD may be coupled only in regions specific to the sensory input modality.
According to the adaptive gain theory of LC–NE function, behavioral performance is optimal during states of intermediate tonic activity, which result in a maximal phasic response to sensory stimuli. Changes in tonic activity drive transitions between high and low attentional states, in which very low or high activity results in inattentive or distractible states, respectively. The phasic LC response only occurs for task-relevant stimuli but is not specific to sensory modality (Aston-Jones et al., 1994).
Because pupil diameter is closely tracked by tonic LC activity (Aston-Jones and Cohen, 2005) and pupil size to single-trial EEG variability (Murphy et al., 2011), we hypothesize that our midbrain correlates of EEG variability reflect underlying LC tonic activity and subsequent LC phasic response and the P300. The concurrent EEG–pupillometry study of Murphy et al. (2011) found a relationship between baseline pupil diameter and variability of the P300 EEG response to auditory targets; that finding was also interpreted as support for the adaptive gain theory. Similar studies using simultaneous EEG, pupillometry, and fMRI would be of great benefit to this field.
There are limited reports of caudate coupling with P300 in both auditory (Crottaz-Herbette and Menon, 2006) and visual (Warbrick et al., 2009) tasks. Because the caudate is highly innervated by dopaminergic neurons (mainly from the VTA and substantia nigra within the midbrain), this cluster provides additional support for the adaptive gain theory, which proposes that the LC–NE system interacts with many other brain circuits and works in synergy with the dopaminergic system (Aston-Jones and Cohen, 2005).
Right-hemisphere cortical activations provide a link between the LC–NE system, VAN, and P300
The LC phasic response has also been theorized to drive transitions that refocus attention in a more complex way, by facilitating an interruption (Dayan and Yu, 2006) and reset (Bouret and Sara, 2005) of the current endogenously driven target network and the LC tonic activity, thus reorienting attention to an updated task. This switch is facilitated by the right-lateralized VAN, which is activated by salient behaviorally relevant stimuli (Corbetta and Shulman, 2002; Bouret and Sara, 2005; Fox et al., 2006; Corbetta et al., 2008). Because the VAN, P300 potentials, and LC phasic responses all similarly show enhanced responses to behaviorally relevant stimuli in multiple sensory modalities, a functional link between them has been proposed, in relation to both tonic transitions of attention and phasic responses (Corbetta et al., 2008).
We found correlates of P300-range EEG variability in the rMFG/rIFG, a main node in the VAN. This region has been implicated previously in P300 modulations (Bledowski et al., 2004a,b; Eichele et al., 2005; Bénar et al., 2007), as were the activations we detected in right precentral gyrus (Bledowski et al., 2004b; Eichele et al., 2005; Crottaz-Herbette and Menon, 2006) and postcentral gyrus (Eichele et al., 2005; Crottaz-Herbette and Menon, 2006) for the same windows. Nearly all our significant correlates of EEG STV were right lateralized, consistent with the VAN and other reports showing that the majority of regions coupling with EEG responses are in the right hemisphere (Eichele et al., 2005). Bilateral precentral gyrus is commonly associated with a more goal-driven and dorsal attention network, but the right precentral gyrus has also been linked to exogenous processing (Kincade et al., 2005; Corbetta et al., 2008).
Our concurrent BS, ACC, rOFC, and rMFG/rIFG activations, which all occur in the P300 range, provide evidence for a strong link between the LC–NE neuromodulatory system, VAN, and the P300. They support the hypothesis that VAN suppression during focused attention is partly attributable to decreased tonic activity of the LC–NE system (Corbetta et al., 2008).
Recurrent interactions underlie task-related attentional modulations
ROIs for fMRI DCM analyses are commonly selected based on peaks within traditional event-related GLM statistical maps. One obvious approach to incorporate single-trial EEG variability into a DCM analysis is to build a bilinear model that includes single-trial EEG measurements as modulators to connectivity strength. However, two issues arise with this approach. First, voxels selected by a traditional GLM contrast are not necessarily affected by the trial-to-trial variability of interest and may instead have variability dominated by noise (Smith et al., 2012). Second, the trial-to-trial variability of the EEG response differs across the temporally specific components of the response. One EEG component may be of particular interest for certain task-specific studies (e.g., the face perception study by Nguyen et al., 2013), but selection of a single component is insufficient to study modulations of attention that generalize across simple tasks in multiple sensory domains. To circumvent these issues, we used the results of our EEG-based fMRI analysis to define ROIs for our DCM analysis. We selected a subset of our suprathreshold clusters that were of particular interest based on previous literature. These included the ACC correlate of early (200 ms) EEG variability, a non-spatially overlapping rACC P300-range correlate, other cortical P300-range correlates in rMFG and rOFC, and the BS cluster. STV within these ROIs was thus meaningful for the effective connectivity we aimed to study.
Our DCM analysis of 12 competing models resulted in substantially greater relative evidence (0.9997 posterior) for a model with multiple reciprocal (bidirectional) intrinsic connections. Furthermore, the only two models within the group tested (Fig. 8) that showed consistently large model evidence on the individual subject level both belonged to the model family with bidirectional connection between the BS and early ACC clusters. Although interpretation requires caution (Lohmann et al., 2012), these results provide support for the idea that coupling between attentional systems plays a role in modulations of task-related attention during simple tasks. Other DCM studies have reported interactions of attention systems, but these focused on linking the VAN with the dorsal attention network during tasks related to spatial attention (Vossel et al., 2012) and attentional reorienting (DiQuattro et al., 2013), and these studies did not incorporate EEG-derived information.
Connectivity parameter magnitudes and signs (determined by BPA) suggest that the role of the BS in these modulations is excitatory across the duration of the trial and multiple cortical regions, whereas the rOFC plays an inhibitory role late in the trial. The integral role of the ACC in attentional modulations is likely more complex, with dependencies in both space (bilateral ACC vs right paracingulate/ACC region) and time (early 200 ms vs later 425 ms poststimulus). Specifically, activity within bilateral ACC may act to inhibit evoked BS activations early in the trial, but then rACC may drive this activity later in the trial. Studies specifically designed to investigate causality will be required to validate such claims. Our DCM results further support the adaptive gain theory of the LC–NE system, and together with the EEG-based GLM results they expand on it by providing additional information regarding the timing of involvement of these regions.
Footnotes
This work was supported by National Institutes of Health Grant R01-MH085092 and by the National Science Foundation Graduate Research Fellowship Program. We thank Glenn Castillo and Stephen Dashnaw for their assistance with EEG–fMRI data acquisition.
The authors declare no competing financial interests.
- Correspondence should be addressed to Paul Sajda, Department of Biomedical Engineering, Columbia University, 351 Engineering Terrace, MC8904, 530 West 120th Street, New York, NY 10027. psajda{at}columbia.edu