Abstract
To facilitate collaborative research efforts between multi-investigator teams using DNA microarrays, we identified sources of error and data variability between laboratories and across microarray platforms, and methods to accommodate this variability. RNA expression data were generated in seven laboratories, which compared two standard RNA samples using 12 microarray platforms. At least two standard microarray types (one spotted, one commercial) were used by all laboratories. Reproducibility for most platforms within any laboratory was typically good, but reproducibility between platforms and across laboratories was generally poor. Reproducibility between laboratories increased markedly when standardized protocols were implemented for RNA labeling, hybridization, microarray processing, data acquisition and data normalization. Reproducibility was highest when analysis was based on biological themes defined by enriched Gene Ontology (GO) categories. These findings indicate that microarray results can be comparable across multiple laboratories, especially when a common platform and set of procedures are used.
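The supplementary tables summarize reproducibility as within- and between-laboratory median Pearson correlation coefficients. As a rough illustration of how such a summary might be computed, here is a minimal sketch. It is not the consortium's analysis pipeline: the function names, the `(lab_id, log_ratios)` data layout and the toy values are all assumptions made for the example.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_correlations(arrays):
    """Return (within-lab median r, between-lab median r).

    `arrays` is a list of (lab_id, log_ratios) tuples, one per hybridization,
    with log ratios listed in the same gene order for every array.
    This layout is hypothetical, chosen only to keep the sketch short.
    """
    within, between = [], []
    for (lab_a, x), (lab_b, y) in combinations(arrays, 2):
        # Pairs from the same laboratory measure within-lab reproducibility;
        # all other pairs measure between-lab reproducibility.
        (within if lab_a == lab_b else between).append(pearson(x, y))
    return median(within), median(between)


# Hypothetical toy data: two labs, two replicate arrays each, five genes.
toy_arrays = [
    ("lab1", [1.0, 2.0, 3.0, 4.0, 5.0]),
    ("lab1", [1.1, 2.1, 2.9, 4.2, 5.0]),
    ("lab2", [2.0, 1.0, 3.5, 3.0, 5.5]),
    ("lab2", [2.2, 0.9, 3.4, 3.1, 5.6]),
]
within_r, between_r = median_correlations(toy_arrays)
```

In this toy data the within-laboratory median is higher than the between-laboratory median, the same qualitative pattern the abstract describes. Real data would also need the preprocessing and normalization steps the study standardized.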
Acknowledgements
We thank J. Quackenbush from The Institute for Genomic Research, L. Hartwell from the Fred Hutchinson Cancer Research Center and R. Wolfinger from the SAS Institute for their scientific contributions. We thank K.J. Yost (Science Applications International) and P. Cozart (NIEHS ITSS) for their information technology support. Research support was provided by National Institute of Environmental Health Sciences grants ES11375, ES11384, ES11387, ES11391 and ES11399, and contract N01-ES-25497.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Additional information
A list of authors and their affiliations appears in the Supplementary Note
Supplementary information
Supplementary Fig. 1
Clustering of laboratory/platform combinations based on log ratio values associated with the common genes. (PDF 752 kb)
Supplementary Table 1
Within and between laboratory median Pearson correlation coefficients of log intensities from standard array experiments. (PDF 92 kb)
Supplementary Table 2
Within and between laboratory median Pearson correlation coefficients of log ratios (LvsP) for standard array experiments using different preprocessing. (PDF 84 kb)
Supplementary Table 3
Common Gene Elements Across All Platforms (Standard and Resident Arrays): Mapping to NIA NAP Clusters. (PDF 134 kb)
Supplementary Table 4
Percent overlap of significantly induced and repressed genes across laboratories for Dataset D and Dataset C, and number of gene transcripts identified as differentially expressed across laboratories for Dataset D and Dataset C. (PDF 91 kb)
Supplementary Table 5
Percentage of the functionally-enriched GO Nodes that demonstrate different levels of concordance within and between branches of the clustering dendrogram. (PDF 47 kb)
Supplementary Note
Author list (PDF 63 kb)
About this article
Cite this article
Members of the Toxicogenomics Research Consortium. Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods 2, 351–356 (2005). https://doi.org/10.1038/nmeth754
DOI: https://doi.org/10.1038/nmeth754