Denoising the speaking brain: Toward a robust technique for correcting artifact-contaminated fMRI data under severe motion
Introduction
Extracting valid biological signals in the presence of complex and often overwhelming sources of artifacts (Caparelli, 2005) has always been a vexing problem for functional magnetic resonance imaging (fMRI) — the most popular neuroimaging technique for mapping brain activity. Unfortunately, while a proliferating number of fMRI studies have appeared over the past two decades, the impact of imaging artifacts in cognitive neuroscience research may have been significantly underestimated. Recently, increasing attention has been paid to this issue because neglecting it may have led to spurious results and misleading theories arising from artifacts correlated with experimental manipulations (Deen and Pelphrey, 2012). For example, it was found that a leading theory on the cause of autism (Power et al., 2010) might have been undermined by a systematic difference in the amount of head motion between subject groups (Power et al., 2012). Similar investigations (Mowinckel et al., 2012, Van Dijk et al., 2012) also identified potential confounds in an fMRI study that had proposed a novel mechanism for cognitive decline in normal aging (Andrews-Hanna et al., 2007).
Although discussion of this issue has so far focused mostly on head motion effects in resting-state connectivity (e.g., Power et al., 2014, Satterthwaite et al., 2012, Yan et al., 2013), the problem affects the entire range of fMRI experiments. The confounding effects of stimulus-correlated motion have long been recognized (Hajnal et al., 1994) in the analysis of task-based activity. In the most severe cases, e.g., studies of continuous overt speech production, researchers need to rely on other imaging techniques instead of conventional fMRI (Kemeny et al., 2005), due to the heavy distortion of blood oxygenation level dependent (BOLD) signals caused by various movement-related mechanisms (Barch et al., 1999, Birn et al., 1998).
Until now, one of the most commonly used methods for dealing with motion-related artifacts is a technique called scrubbing (Power et al., 2012), also known as frame or volume censoring (Fair et al., 2012, Power et al., 2014), which identifies and rejects noise-contaminated images based on a set of criteria for estimating the degree of motion or amount of artifactual changes in image intensity: e.g., framewise displacement (FD), an empirical sum of the rigid-body motion between consecutive images in all directions; and DVARS, a whole-brain measure of the temporal derivative (D) of image intensity computed by taking the root-mean-square variance across voxels (VARS). Although this method is straightforward to understand and easy to apply, it has at least three apparent limitations: 1) statistical power is reduced because of the rejection of images, especially when there is a significant degree of motion present in the data; 2) artifacts with potential detrimental effects, though not meeting the threshold for rejection, still exist in the remaining images; 3) the inability to derive continuous time series may jeopardize analytical methods that depend on an unbroken temporal sequence of images, e.g., methods utilizing causality, periodicity, phase, and entropy measures.
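The two censoring criteria described above can be illustrated with a minimal sketch. It assumes a (T, 6) array of realignment parameters (three translations in mm followed by three rotations in radians) and a (T, V) array of voxel intensities. The function names, the 50 mm head radius used to convert rotations to millimeters, the thresholds, and the rule of flagging a frame when either measure is exceeded are all illustrative assumptions, not values taken from this study.

```python
import numpy as np

def framewise_displacement(motion, radius_mm=50.0):
    """FD per frame from a (T, 6) array: 3 translations (mm), 3 rotations (rad).

    Rotations are converted to arc length on a sphere of radius_mm
    (an assumed head radius). The first frame gets FD = 0.
    """
    motion = np.asarray(motion, dtype=float)
    diffs = np.abs(np.diff(motion, axis=0))   # frame-to-frame change
    diffs[:, 3:] *= radius_mm                 # radians -> mm
    return np.concatenate([[0.0], diffs.sum(axis=1)])

def dvars(data):
    """RMS across voxels of the temporal difference of a (T, V) array."""
    data = np.asarray(data, dtype=float)
    d = np.diff(data, axis=0)
    return np.concatenate([[0.0], np.sqrt(np.mean(d ** 2, axis=1))])

def censor_mask(fd, dv, fd_thresh=0.5, dvars_thresh=5.0):
    """Boolean mask, True = keep the frame (thresholds are placeholders)."""
    return ~((np.asarray(fd) > fd_thresh) | (np.asarray(dv) > dvars_thresh))
```

Applying `data[censor_mask(fd, dv)]` drops the flagged frames, which makes the first and third limitations concrete: the retained series is shorter and no longer evenly sampled in time.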
These significant limitations have created a growing demand for development of a robust technique – whether data-driven or model-based – that can thoroughly remove all major sources of artifacts, and, critically, can preserve the integrity of continuous fMRI time series. Here we present a blind source separation (BSS) technique based on spatial independent component analysis (sICA) that addresses these demands. We believe that it represents an effective solution for the following two reasons.
First, a BSS technique eliminates the need to obtain accurate predictor measurements or to establish quantitative relationships between motion predictors and imaging artifacts, both of which are required in model-based denoising. This feature is particularly important given the complex and nonlinear mechanisms by which the fMRI artifacts are generated (Caparelli, 2005). For example, the use of Volterra expanded rigid-body alignment parameters as nuisance covariates (which is a typical example of a general class of model-based denoising methods called nuisance variable regression; Lund et al., 2006) can reduce certain effects of head motion such as the spin history effect (Friston et al., 1996), but fails to account for other mechanisms of residual head motion such as susceptibility-by-motion interaction (Andersson et al., 2001, Wu et al., 1997), or effects due to non-rigid motion that are present in only a fraction of slices during multislice echo planar imaging (EPI). Another popular denoising method, RETROICOR (Retrospective Image-Based Correction; Glover et al., 2000), removes physiological noise based on predictors computed from auxiliary cardiac and respiratory recordings. But its effectiveness in practical application often suffers from inaccuracies in cardiac/respiratory peak detection caused by measurement noise of these auxiliary recordings.
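For comparison, nuisance variable regression with expanded motion parameters can be sketched as follows. The 24-regressor expansion used here (the six realignment parameters, their values at the previous frame, and the squares of both) is one common reading of the Friston et al. (1996) model; the function names and array layout are assumptions made for illustration.

```python
import numpy as np

def expand_motion_24(motion):
    """Expand (T, 6) realignment parameters to (T, 24) regressors.

    Column order: R_t, R_t^2, R_{t-1}, R_{t-1}^2. The lagged terms
    for the first frame are set to zero.
    """
    motion = np.asarray(motion, dtype=float)
    lagged = np.vstack([np.zeros((1, motion.shape[1])), motion[:-1]])
    return np.hstack([motion, motion ** 2, lagged, lagged ** 2])

def regress_out(data, nuisance):
    """Return residuals of (T, V) data after OLS on nuisance plus intercept.

    Note that the intercept removes each voxel's mean as well.
    """
    data = np.asarray(data, dtype=float)
    design = np.column_stack([np.ones(len(nuisance)), nuisance])
    beta, *_ = np.linalg.lstsq(design, data, rcond=None)
    return data - design @ beta
```

The sketch makes the limitation visible: only variance that is a linear function of the supplied regressors is removed, so artifacts that the rigid-body parameters do not predict pass through unchanged.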
Second, because sICA optimizes spatial rather than temporal independence, and utilizes higher-order statistics rather than simple correlation (Calhoun and Adali, 2006), it is ideally suited for the removal of task-correlated motion, which inevitably affects many of the interactive tasks requiring either overt motor (Field et al., 2000) or verbal (Barch et al., 1999) responses. These are typical instances in which a regression model fails to give an unbiased estimate due to multicollinearity between artifacts and effects of interest (Johnstone et al., 2006).
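How component-based denoising keeps the time series continuous can be shown with a short sketch. It assumes the decomposition has already been computed by some ICA algorithm, so that the (T, V) data are approximated by the product of a (T, K) matrix of component time courses and a (K, V) matrix of spatial maps. The function name and the index list of noise components are hypothetical.

```python
import numpy as np

def remove_noise_components(data, time_courses, spatial_maps, noise_idx):
    """Subtract the contribution of selected components from (T, V) data.

    time_courses: (T, K) mixing matrix, one column per component.
    spatial_maps: (K, V) source matrix, one row per component.
    noise_idx:    indices of components labeled as noise (assumed given).
    """
    data = np.asarray(data, dtype=float)
    A = np.asarray(time_courses, dtype=float)
    S = np.asarray(spatial_maps, dtype=float)
    noise_idx = np.asarray(noise_idx, dtype=int)
    # Rebuild only the noise part and remove it; every frame is retained.
    return data - A[:, noise_idx] @ S[noise_idx, :]
```

Unlike censoring, the output has the same number of frames as the input, so methods that need an unbroken temporal sequence remain applicable.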
However, certain features of the BSS approach have prevented the wide adoption of sICA as the method of choice for fMRI denoising. One major obstacle has been the lack of a common ground truth for identifying “what is signal and what is noise” (McKeown et al., 2003). While the neural mechanisms of signal components may vary experiment by experiment, each type of structured noise should have common characteristics that can be systematically studied according to their physical or physiological mechanisms (e.g., those described in Lund et al., 2006). Such a quantitative and mechanistic classification scheme has yet to be established, although an early publication (McKeown et al., 1998) included some qualitative descriptions of a very limited number of stereotyped components. Another study (Kelly et al., 2010) attempted to characterize sICA components by highly subjective and often ambiguous visual appearances such as the “spottiness” and “peripheralness” of component maps or the frequency spectra and spike distributions of component time courses, without any concrete measures related to their mechanisms of generation. A classification scheme purely based on these visual appearances may yield misleading results even if the inter-rater agreement is high. This is because common errors among raters may be driven by spatial overlaps between focal artifacts and cortical structures or by temporal similarity between task-correlated motion and cerebral activity.
The second, closely related issue is the need to develop a reliable computational method to automate the binary classification of signal and noise components. There have been several published studies aimed at resolving this issue, but their practical utility is generally limited by some common problems. First, due to the lack of a ground truth, the accuracies of these methods were either completely untested (Kochiyama et al., 2005, Kundu et al., 2012, Thomas et al., 2002) or only tested against the subjective classification scores of one or two human experts whose operational criteria or inter-rater reliability was often not reported (Bhaganagarapu et al., 2013, De Martino et al., 2007, Perlbarg et al., 2007, Tohka et al., 2008). Second, the quantitative measures (i.e., features) used for classification, which are usually based on the temporal (Kochiyama et al., 2005, Perlbarg et al., 2007, Rummel et al., 2013), spectral (Thomas et al., 2002), spatial or combined (Bhaganagarapu et al., 2013, De Martino et al., 2007, Tohka et al., 2008) properties of each component (as an exception, see Kundu et al., 2012), either were arbitrarily selected or had limited applicability due to an uncommon experimental setup (Kochiyama et al., 2005, Kundu et al., 2012, Thomas et al., 2002). A systematic method for individual feature selection is still lacking for the binary classification of sICA components. Third, the thresholds of these classification features were determined by arbitrary tuning (Kundu et al., 2012, Perlbarg et al., 2007, Rummel et al., 2013) or supervised learning (De Martino et al., 2007, Tohka et al., 2008) based only on a few pre-labeled datasets, thus the generalizability of these methods may be unreliable due to the variation of ICA components across datasets.
The next unresolved question is: by what means is it possible to effectively validate the results of denoising? This is especially critical for a BSS technique such as sICA because the signal-noise separability of fMRI data after source decomposition is largely untested. Since one of the most problematic issues caused by imaging artifacts is the potential introduction of both false positives and false negatives, the effectiveness of denoising should not be solely evaluated by the increase or decrease of task-based activity or resting-state connectivity in the absence of an absolute reference. This appears to be another common problem in the previous investigations (e.g., Kundu et al., 2012, Tohka et al., 2008).
In brief, previous implementations of sICA as a data-driven denoising approach, while theoretically sound and well intentioned, have essentially remained conceptual. The primary goal of this study is to present a robust technique, as well as a complete and general framework for empirical evaluation of existing and future sICA-based denoising techniques, by thoroughly resolving each of the fundamental issues outlined above.
One of the important methodological advances represented by our technique is based on the novel observation that by expanding the analysis mask of sICA to whole-head coverage, fMRI intensities in extracerebral soft tissues (e.g., muscles, arteries, and ocular structures) or air cavities (e.g., larynx and frontal sinus), where artifacts may originate, can be directly revealed in the same components that contain artifacts within brain tissue (Fig. 1b). These extracerebral noise sources, which are usually obscured by their low intensity in other types of analysis, not only provide salient spatial information for the classification of a variety of noise components, but also help identify their potential mechanisms of generation.
Moreover, we are also able to corroborate these mechanisms by examining the temporal correlations between component time courses and existing measurements of physical or physiological motion (Fig. 1c). Although we do not recommend using these auxiliary measurements directly for model-based denoising for the reasons mentioned above (measurement or prediction inaccuracy, multicollinearity, etc.), they nevertheless play an important role in identifying the potential source mechanisms after BSS.
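The kind of temporal comparison described here can be sketched as a Pearson correlation between each component time course and one auxiliary recording. The sketch assumes both are sampled at the same time points and have nonzero variance; the function name is hypothetical, and any resampling or lag handling is left out.

```python
import numpy as np

def correlate_with_reference(time_courses, reference):
    """Pearson r between each column of (T, K) time_courses and a (T,) reference.

    Assumes equal sampling and nonzero variance in every series.
    """
    tc = np.asarray(time_courses, dtype=float)
    ref = np.asarray(reference, dtype=float)
    tc_z = (tc - tc.mean(axis=0)) / tc.std(axis=0)   # z-score each component
    ref_z = (ref - ref.mean()) / ref.std()           # z-score the reference
    return tc_z.T @ ref_z / ref.size
```

A component whose time course tracks, say, a respiration or displacement trace would show a large absolute correlation, which supports a mechanistic label without using the trace as a regressor.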
Crucially, our technique incorporates a dual-mask method with spatially matched components in both a whole-brain analysis mask and a whole-head analysis mask (Fig. 1a). As a key bridge toward a mechanistic classification of sICA components, this innovative procedure not only overcomes a known trade-off between the analysis mask size and the spatial discriminatory power of ICA (Formisano et al., 2002), but also allows a more accurate estimation of component time courses that represent the intracerebral dynamics of interest.
The mechanistic classification scheme derived from the above methods also provides a methodological basis for designing and validating automated computer algorithms aimed at binary component classification. Furthermore, rather than relying on arbitrarily selected features or algorithms with limited applicability, we present a general framework for guiding the design of automated classifiers, and for evaluating their performance and generalizability. In particular, the two quantitative criteria that we propose for feature selection – sensitivity index and bimodality coefficient – can be applied to estimating the performance of a wide range of classification features. The machine learning classifier developed with our technique, which is based on a simple set of spatial features and an unsupervised expectation maximization algorithm, achieves near-perfect accuracy and sufficient generalization for fully automated (i.e., without further need for human verification) and broad (i.e., in a variety of experimental paradigms) applications.
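The two feature-selection criteria and the unsupervised step can be illustrated on a single scalar feature. The sketch below uses textbook definitions: the sensitivity index d' as the difference of class means over the root mean of the two variances, and the bimodality coefficient computed from bias-corrected sample skewness and excess kurtosis, with values above roughly 0.555 (the value for a uniform distribution) suggesting bimodality. Whether the study uses exactly these forms, and how it combines several spatial features, is not stated in this section, so the formulas, the one-dimensional two-Gaussian mixture, and all names are assumptions.

```python
import numpy as np

def sensitivity_index(signal_vals, noise_vals):
    """d' between two labeled groups of a scalar feature."""
    a = np.asarray(signal_vals, dtype=float)
    b = np.asarray(noise_vals, dtype=float)
    return abs(a.mean() - b.mean()) / np.sqrt(0.5 * (a.var(ddof=1) + b.var(ddof=1)))

def bimodality_coefficient(values):
    """(skew^2 + 1) / (excess kurtosis + finite-sample correction)."""
    x = np.asarray(values, dtype=float)
    n = x.size
    d = x - x.mean()
    m2, m3, m4 = np.mean(d ** 2), np.mean(d ** 3), np.mean(d ** 4)
    g1 = m3 / m2 ** 1.5                  # population skewness
    g2 = m4 / m2 ** 2 - 3.0              # population excess kurtosis
    skew = g1 * np.sqrt(n * (n - 1)) / (n - 2)
    kurt = (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * g2 + 6)
    return (skew ** 2 + 1) / (kurt + 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

def em_two_gaussians(x, n_iter=100):
    """Fit a 1-D two-component Gaussian mixture by EM; return labels and means."""
    x = np.asarray(x, dtype=float)
    mu = np.array([x.min(), x.max()])    # component 0 starts at the low end
    sd = np.array([x.std(), x.std()])
    w = np.array([0.5, 0.5])
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: update weights, means, and standard deviations
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sd = np.maximum(np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk), 1e-6)
    return resp.argmax(axis=1), mu
```

The sensitivity index needs labeled components, so it suits evaluation against a reference classification. The bimodality coefficient and the mixture fit use only the unlabeled feature values, which is what allows the classification step to run without supervision.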
As a proof of principle, the power of our technique was demonstrated in the context of imaging continuous overt speech production. The reason for selecting speech production for our primary investigation is not only because it represents one of the most egregious examples of experimentally induced artifacts, well documented by a series of studies (Barch et al., 1999, Birn et al., 1998, Kemeny et al., 2005, Mehta et al., 2006), but also due to the availability of a true estimate of task-based activity for evaluating the effectiveness of denoising.
A true estimate of task-based activity cannot be obtained by using BOLD fMRI alone because most of the artifacts are intrinsically associated with the physical aspects of magnetic fields employed by the technique (e.g., field inhomogeneity and magnetic susceptibility; Caparelli, 2005). In this study, positron emission tomography (PET) – the de facto “gold standard” for imaging continuous overt speech production (Kemeny et al., 2005) – was used as a vehicle for cross-modal validation, since these artifacts are clearly absent in PET (although this imaging modality is less commonly used nowadays in cognitive research due to radiation dose limitations and a relatively poor temporal resolution; see Supplementary Appendix A for more details on why and how to use PET as a reference measure for our purposes).
Finally, the general applicability of our technique was investigated using the resting-state data published by Power et al. (2012). Although a true estimate of resting-state connectivity is currently not available, several quantitative measures, such as FD, DVARS, and the distance-dependent effect of head motion, have been systematically investigated in a series of studies (Fair et al., 2012, Power et al., 2012, Power et al., 2014). Hence, they can be utilized as endpoints to compare the performance of the present technique with previous methods in a quantitative way.
Subjects and experimental paradigm
Eighteen healthy, right-handed, native English speakers (7 males, 11 females; aged 20–32 years) participated in this study. All participants were scanned in an fMRI experiment and 17 of them participated in a subsequent PET experiment. Approval for these experiments was obtained from the institutional review board of the National Institutes of Health.
Prior to the experiments, participants were trained to be familiar with all stimulus materials including narrative stories and pseudowords. Each of
Characterization and mechanistic classification of structured noise components
The denoising technique presented here starts with and builds upon the fundamental understanding of the spatiotemporal characteristics of sICA components, which includes three aspects. First, descriptive characteristics for defining each noise category are derived from more than a thousand components across 18 datasets collected during different task states (speech production/comprehension and resting fixation). Second, a set of spatial features are derived from the above characteristics for
Conclusions
In conclusion, our denoising technique can be applied in a variety of experimental paradigms for improving the reliability of fMRI measurements. The entire procedure is fully automated and has minimal impact on other features of conventional data processing. Both the mechanistic component classification scheme that is proposed as a ground truth of denoising, and the general framework for designing/evaluating automated component classifiers, appear to achieve their goals in the current study;
Acknowledgments
This study was supported by the NIH Intramural Research Program (Protocol 92-DC-0178). We thank Venkata S. Mattay, S. Lalith Talagala and Souheil J. Inati for valuable comments on the manuscript; and Caroline F. Zink for helpful discussions. We are also particularly grateful to Jonathon D. Power for kindly sharing the resting-state fcMRI data.
References (62)
- et al. Modeling geometric deformations in EPI time series. Neuroimage (2001)
- et al. Disruption of large-scale brain systems in advanced aging. Neuron (2007)
- et al. Unified segmentation. Neuroimage (2005)
- et al. Overt verbal responding during fMRI scanning: empirical investigations of problems and potential solutions. Neuroimage (1999)
- et al. Experimental designs and processing strategies for fMRI studies involving overt verbal responses. Neuroimage (2004)
- et al. Separating respiratory-variation-related fluctuations from neuronal-activity-related fluctuations in fMRI. Neuroimage (2006)
- et al. Neural systems supporting lexical search guided by letter and semantic category cues: a self-paced overt response fMRI study of verbal fluency. Neuroimage (2010)
- AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. (1996)
- et al. Localization of cardiac-induced signal change in fMRI. Neuroimage (1999)
- et al. Classification of fMRI independent components using IC-fingerprints and support vector machine classifiers. Neuroimage (2007)
- Whole brain high-resolution functional imaging at ultra high magnetic fields: an application to the analysis of resting state networks. Neuroimage
- Spatial independent component analysis of functional magnetic resonance imaging time-series: characterization of the cortical components. Neurocomputing
- Characterization and correction of interpolation effects in the realignment of fMRI time series. Neuroimage
- Independent component analysis: algorithms and applications. Neural Netw.
- Visual inspection of independent components: defining a procedure for artifact removal from fMRI data. J. Neurosci. Methods
- Removing the effects of task-related motion using independent-component analysis. Neuroimage
- Differentiating BOLD and non-BOLD signals in fMRI time series using multi-echo EPI. Neuroimage
- Non-white noise in fMRI: does modelling have an impact? Neuroimage
- A method for removal of global effects from fMRI time series. Neuroimage
- Deterministic and stochastic features of fMRI data: implications for analysis of event-related experiments. J. Neurosci. Methods
- Independent component analysis of functional MRI: what is signal and what is noise? Curr. Opin. Neurobiol.
- Analysis of speech-related variance in rapid event-related fMRI using a time-aware acquisition system. Neuroimage
- Network-specific effects of age and in-scanner subject motion: a resting-state fMRI study of 238 healthy adults. Neuroimage
- CORSICA: correction of structured noise in fMRI by automatic identification of ICA components. Magn. Reson. Imaging
- The development of human functional brain networks. Neuron
- Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. Neuroimage
- Methods to detect, characterize, and remove motion artifact in resting state fMRI. Neuroimage
- Impact of in-scanner head motion on multiple measures of functional connectivity: relevance for studies of neurodevelopment in youth. Neuroimage
- Noise reduction in BOLD-based fMRI using component analysis. Neuroimage
- Automatic independent component labeling for artifact removal in fMRI. Neuroimage
- The influence of head motion on intrinsic functional connectivity MRI. Neuroimage