Abstract
Microarray-based assays have significantly expanded their scope and range of applications over the last 10 years, and – at least for gene expression – can now be considered mainstream. High-throughput, microarray-based gene expression studies have proven particularly useful in the study of neurodegenerative diseases, where they have provided key insights into disease pathogenesis, regional and cellular specificity, and the identification of therapeutic targets. Even though many experimental steps are currently performed in specialized core facilities, the key steps of a microarray study – experimental design, and data analysis and interpretation – are performed by the primary investigator. Knowledge of the issues related to these key steps is essential to properly perform and interpret a microarray experiment and constitutes the main focus of the present chapter. The basic analytical steps are covered, and annotated R code for the analysis of a published dataset is provided.
Acknowledgments
The author would like to thank Fuying Gao and Jeremy Davis-Turak for technical assistance, and Drs. Michael Oldham and Daniel Geschwind for critically reading the manuscript.
Copyright information
© 2011 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Coppola, G. (2011). Designing, Performing, and Interpreting a Microarray-Based Gene Expression Study. In: Manfredi, G., Kawamata, H. (eds) Neurodegeneration. Methods in Molecular Biology, vol 793. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-61779-328-8_28
DOI: https://doi.org/10.1007/978-1-61779-328-8_28
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-61779-327-1
Online ISBN: 978-1-61779-328-8
eBook Packages: Springer Protocols