Elsevier

NeuroImage

Volume 38, Issue 1, 15 October 2007, Pages 43-56
Enhanced signal detection in neuroimaging by means of regional control of the global false discovery rate

https://doi.org/10.1016/j.neuroimage.2007.07.031

Abstract

In the context of neuroimaging experiments, it is essential to account for the multiple comparisons problem when thresholding statistical mappings. Various methods are in use to deal with this issue, but they differ in their signal detection power for small- and large-scale effects. In this paper, we comprehensively describe a new method that is based on control of the false discovery rate (FDR). Our method increases sensitivity by exploiting the spatially clustered nature of neuroimaging effects. This is achieved by using a sliding window technique, in which FDR-control is first applied at a regional level. Thus, a new statistical map that is related to the regionally achieved FDR is derived from the available voxelwise P-values. On the basis of receiver operating characteristic (ROC) curves, thresholding based on this map is demonstrated to have better discriminatory power than conventional thresholding based on P-values. Secondly, it is shown that the resulting maps can be thresholded at a level that results in control of the global FDR. By means of statistical arguments and numerical simulations under widely varying conditions, our method is validated, characterized, and compared to some other common voxel-based methods (uncorrected thresholding, Bonferroni correction, and conventional FDR-control). It is found that our method shows considerably higher sensitivity as compared to conventional FDR-control, while still controlling the achieved FDR at the same level or better. Finally, our method is applied to two diverse neuroimaging experiments to assess its practical merits, resulting in substantial improvements as compared to the other methods.

Introduction

In neuroimaging studies, it is often desirable to reveal and map structural or functional characteristics of the brain on a detailed scale. This generally means that analyses are carried out at the level of individual volume elements (voxels). However, because the total number of considered voxels is usually enormous, the use of common statistical thresholds to assess the significance of effects in individual voxels would result in an unacceptably large number of voxels for which the null-hypothesis, that no effects are present, is falsely rejected (i.e., the number of false positive voxels). For instance, using a threshold p = 0.05, a significant effect would be detected on average in fifty voxels for every thousand voxels in the volume of interest, even if no effect is truly present in any voxel. Because the number of voxels that have to be taken into consideration can run into the hundreds of thousands, the number of false positives could easily exceed the number of true positives, and results would be largely based on chance. Therefore, the use of such simple statistical tests is generally unacceptable.
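The fifty-per-thousand figure can be checked with a short simulation. The following is a minimal sketch in Python, not code from the paper: it assumes independent voxels whose P-values are uniformly distributed under the null-hypothesis, and the function name `count_false_positives` and the volume size of 100,000 voxels are chosen here purely for illustration.

```python
import random

def count_false_positives(n_voxels, p_threshold, seed=0):
    """Count voxels declared significant when no true effect exists anywhere.

    Assumption: under the null-hypothesis each voxel's P-value is an
    independent draw from the uniform distribution on [0, 1).
    """
    rng = random.Random(seed)
    return sum(rng.random() < p_threshold for _ in range(n_voxels))

n_voxels = 100_000      # hypothetical volume size
p_threshold = 0.05      # uncorrected voxelwise threshold
expected = n_voxels * p_threshold
observed = count_false_positives(n_voxels, p_threshold)
print(f"expected about {expected:.0f} false positives, observed {observed}")
```

With these settings roughly 5000 voxels pass the threshold by chance alone, although none of them contains a true effect.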

This ‘multiple comparisons problem’ can be remedied by the use of a stricter statistical threshold. As a result, the proportion of voxels for which an effect is falsely detected is reduced, increasing the specificity and the reliability of the results. However, an inevitable disadvantage is that the probability of detecting voxels with true effects will also decrease, thus reducing the sensitivity of the analysis. In the literature, various methods have been proposed to find a compromise in this situation, by optimizing signal detection while still controlling some measure for the error rate (Logan and Rowe, 2004).

A common method is to adjust the threshold such that the probability for the presence of any false positives in the entire volume (the familywise error rate, FWE) is kept below some upper limit α (Nichols and Hayasaka, 2003). A distinction can be made between methods with weak and strong error control. For weak error control, the chance of falsely rejecting one or more null-hypotheses is bounded by a specified level α if the null-hypothesis holds everywhere; for strong error control, the chance of falsely rejecting one or more null-hypotheses is bounded by a specified level α for any subset of the voxels for which the null-hypothesis holds. The essential difference is that methods with strong control have the ability to localize effects, while methods with weak control only assess if effects are present anywhere in the volume.
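To give a feel for how quickly the FWE grows with the number of tests, the sketch below evaluates the textbook expression 1 − (1 − p)^N for the probability of at least one false positive among N null tests. This is an illustration only: it assumes statistically independent tests, which neuroimaging data generally do not satisfy, and the function name `familywise_error_rate` is introduced here and does not come from the paper.

```python
def familywise_error_rate(p_threshold, n_tests):
    """Probability of at least one false positive among n_tests null tests.

    Assumption: the tests are independent and the null-hypothesis
    holds for every one of them.
    """
    return 1.0 - (1.0 - p_threshold) ** n_tests

# An uncorrected threshold of 0.05 applied to 1000 voxels.
print(familywise_error_rate(0.05, 1000))
# The same 1000 voxels with the threshold divided by the number of tests.
print(familywise_error_rate(0.05 / 1000, 1000))
```

Under these assumptions the uncorrected threshold makes at least one false positive practically certain, whereas dividing the threshold by the number of tests keeps the FWE just below 0.05.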

A simple and common approach that provides strong control of the FWE is offered by the Bonferroni correction: if tests are performed on a large number N of voxels, an FWE bound α can be achieved for the entire volume by applying a stricter threshold p = α/N to each individual voxel. However, the Bonferroni correction is quite stringent and often considered too conservative. Better methods have been described (Holm, 1979; Hochberg, 1988), but analyses on neuroimaging data usually benefit little from such improvements because of the large proportion of voxels without significant effects. Also, in practice, spatial correlations may exist between neighboring voxels, requiring the use of more refined techniques like Gaussian field theory or resampling methods in order to make accurate statistical inferences (Nichols and Hayasaka, 2003; Worsley, 2005).
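The Bonferroni threshold p = α/N described above can be sketched in a few lines of Python. This is an illustrative sketch rather than code from the paper; the function name `bonferroni_reject` and the example P-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Flag the tests whose P-value passes the Bonferroni threshold alpha / N."""
    n_tests = len(p_values)
    if n_tests == 0:
        return []
    threshold = alpha / n_tests  # stricter per-voxel threshold p = alpha / N
    return [p <= threshold for p in p_values]

# Four hypothetical voxelwise P-values; the threshold becomes 0.05 / 4 = 0.0125.
p_values = [1e-7, 0.001, 0.04, 0.5]
print(bonferroni_reject(p_values))
```

In this toy example only the first two voxels survive. With hundreds of thousands of voxels the threshold shrinks accordingly, which illustrates why the correction is often considered too conservative.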

Some alternative methods are also based on FWE-based error control, but test the significance of effects in clusters or regions of interest as a whole instead of in individual voxels. Because this reduces the total number of tests, and because neuroimaging effects are usually clustered in nature, such cluster- and set-level inferences are generally more powerful than voxel-level inferences. However, such inferences are less regionally specific because they cannot be made at the level of individual voxels and only apply to an entire cluster or set of clusters. Various methods have been proposed that compromise between sensitivity and regional specificity in different ways (Worsley et al., 1995; Friston et al., 1996; Poline et al., 1997; Heller et al., 2006), some of which offer the potential to detect both weak but extensive diffuse effects and strong but confined focal effects at the same time.

A completely different approach to error control in neuroimaging experiments is provided by relatively new developments that are not based on FWE, but that limit the false discovery rate (FDR) (Benjamini and Hochberg, 1995; Genovese et al., 2002; Laird et al., 2005; Singh and Dan, 2006). The FDR is the proportion of incorrect rejections of the null-hypothesis among the total number of rejections, or in other words the proportion of false positives among all the positives. This error measure addresses the issue that it is often permissible for the null-hypothesis to be falsely rejected in some voxels, as long as these constitute a negligible fraction in comparison with the total number of voxels for which the null-hypothesis is rejected. This is for instance the case if one is interested in large-scale spatial patterns in the brain, or if summary values are calculated over the detected regions (e.g., the mean level, spatial extent, or lateralization index of neural effects).
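For reference, conventional (global) FDR-control is commonly implemented with the step-up procedure of Benjamini and Hochberg (1995). The sketch below shows that standard procedure only; it is not the regional method proposed in this paper, and the function name `benjamini_hochberg` and the example P-values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag the tests rejected by the Benjamini-Hochberg step-up procedure.

    The P-values are ranked in ascending order, and the largest rank k
    with P_(k) <= (k / N) * q is located. The k smallest P-values are
    then rejected, including any that failed their own rank criterion.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    n_rejected = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / n:
            n_rejected = rank  # keep the largest rank that satisfies the criterion
    reject = [False] * n
    for index in order[:n_rejected]:
        reject[index] = True
    return reject

# Four hypothetical P-values; none passes the Bonferroni threshold 0.05 / 4.
print(benjamini_hochberg([0.02, 0.02, 0.03, 0.04], q=0.05))
```

In this toy example all four tests are rejected at q = 0.05, whereas none of them would pass a Bonferroni threshold of 0.0125. This illustrates the more tolerant thresholds that FDR-control can yield when a notable proportion of the tests contain true effects.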

FDR-controlling procedures are claimed to be more powerful than FWE-controlling procedures, and FDR-control has been predicted to overtake FWE-control as the most common measure to limit the number of false positives (Nichols and Hayasaka, 2003). Current FDR-related methods can be advantageous over FWE-related methods especially if true positive regions comprise many voxels and constitute a notable proportion of the total volume of interest. In such cases, a fair number of false positives can be allowed without much affecting the overall outcome, leading to tolerant statistical thresholds and improved sensitivity.

In this paper, we present an extension to Benjamini’s FDR-controlling method that is based on regional analyses to exploit the generally clustered nature of true neuroimaging effects. Our method will be substantiated and validated, and we will show that our method usually results in better sensitivity than Benjamini’s original method while it is still possible to achieve identical global control of the FDR. In particular, our method provides enhanced sensitivity to large brain volumes with weak effects while still imposing strong thresholds on small isolated foci, similar to cluster- and set-level inferences in FWE-related methods. Our method will be tested and compared to other commonly used methods using numerical simulations. Furthermore, to assess the practical feasibility and benefits, the new method will be applied to two distinct neuroimaging experiments that employ diffusion weighted imaging (DWI) and functional magnetic resonance imaging (fMRI) to focus on local structural and functional changes in the brain, respectively.

Section snippets

Theory

In this paper, an uppercase notation (i.e., P and Q) will be used to denote statistics that are attained in individual voxels, while a lowercase notation (i.e., p and q) will be used to indicate the thresholds that are imposed on the corresponding maps. Furthermore, regarding FDR-control, an explicit distinction will be made between global and regional bounds by means of subscripts (i.e., q_global and q_regional).

Simulations

Numerical simulations with artificially generated neuroimaging data were performed in the MATLAB programming environment while varying a number of configuration parameters. The aim was to assess the validity and test the performance of our proposed method in a controlled manner, and to compare its results with those of some other frequently used voxel-based methods.

A 3D data matrix was filled with normally distributed pseudo-random noise. The default dimensions of the volume were 24 × 24 × 16. To model spatial

Simulations

Fig. 2 illustrates simulated data using default parameter settings, along with the results that were obtained using the four different thresholding methods. Fig. 2a displays the simulated signal in a single slice, consisting of normally distributed noise in the entire image in addition to a constant signal that is confined to the outlined central square area. In Fig. 2b, the voxels in this slice that are assessed to contain significant effects according to various methods are highlighted. By

Validity

In this paper, we described a new method to control the global FDR. By taking the clustered nature of neuroimaging effects into account by means of regional weightings, the sensitivity of the method is enhanced as compared to conventional (global) FDR-controlling methods. We have provided informal quantitative arguments that back up our method in the Theory section and Appendices, but did not prove our reasoning with mathematical rigor. To validate our method for practical purposes, a wide

Acknowledgments

The authors would like to acknowledge the contribution of the anonymous reviewers, who provided valuable comments that inspired a thorough revision of the method, which led to enormous improvements of its validity and usability. Also, we acknowledge the contribution of K. Jaspers who participated in the acquisition of the fMRI data.

Cited by (57)

  • Diverging volumetric trajectories following pediatric traumatic brain injury

    2017, NeuroImage: Clinical
    Citation Excerpt :

    There were several other clusters of significant group differences, detailed in Table 3 and Fig. 2. Results were corrected for multiple comparisons using searchlight FDR (Langers et al., 2007) (q < 0.05). Given the significant group differences in the IC, we additionally charted the volume of the right IC at both time points across participants in all 3 groups, shown in Fig. 3.
