Journal of Computational Biology
Statistical Development and Evaluation of Microarray Gene Expression Data Filters

To cite this article:
Stan Pounds, Cheng Cheng. Journal of Computational Biology. May 2005, 12(4): 482-495. doi:10.1089/cmb.2005.12.482.


Stan Pounds
Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105-2794.
Cheng Cheng
Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105-2794.

Filtering is a common practice used to simplify the analysis of microarray data by removing from subsequent consideration probe sets believed to be unexpressed. The m/n filter, which is widely used in the analysis of Affymetrix data, removes all probe sets having fewer than m present calls among a set of n chips. The m/n filter has been widely used without considering its statistical properties. The level and power of the m/n filter are derived. Two alternative filters, the pooled p-value filter and the error-minimizing pooled p-value filter, are proposed. The pooled p-value filter combines information from the present–absent p-values into a single summary p-value, which is subsequently compared to a selected significance threshold. We show that the pooled p-value filter is the uniformly most powerful statistical test under a reasonable beta model and that it exhibits greater power than the m/n filter in all scenarios considered in a simulation study. The error-minimizing pooled p-value filter compares the summary p-value with a threshold determined to minimize a total-error criterion based on a partition of the distribution of all probes' summary p-values. The pooled p-value and error-minimizing pooled p-value filters clearly perform better than the m/n filter in a case-study analysis. The case-study analysis also demonstrates a proposed method for estimating the number of differentially expressed probe sets excluded by filtering and the subsequent impact on the final analysis. The filter impact analysis shows that the use of even the best filter may hinder, rather than enhance, the ability to discover interesting probe sets or genes. S-PLUS and R routines to implement the pooled p-value and error-minimizing pooled p-value filters have been developed and are available from www.stjuderesearch.org/depts/biostats/index.html.
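
The two filtering ideas described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' S-PLUS/R code. The function names, the toy p-values, and the call threshold of 0.04 are assumptions made for illustration. The level calculation assumes independent Uniform(0,1) p-values for an unexpressed probe set, which may differ from the model used in the paper. Fisher's method is used here as a stand-in for the pooling step; the paper's own summary statistic may differ.

```python
from math import comb, exp, log

def mn_filter(pvalues_by_probe, m, call_threshold=0.04):
    """Keep probe sets with at least m 'present' calls among their n chips.

    A chip is counted as a present call when its p-value is below
    call_threshold (an assumed cutoff, not taken from the abstract).
    """
    kept = []
    for probe, pvals in pvalues_by_probe.items():
        present_calls = sum(1 for p in pvals if p < call_threshold)
        if present_calls >= m:
            kept.append(probe)
    return kept

def mn_filter_level(m, n, call_threshold=0.04):
    """Chance that an unexpressed probe set passes the m/n filter.

    Assumes the n null p-values are independent Uniform(0,1), so the
    number of present calls is Binomial(n, call_threshold).
    """
    a = call_threshold
    return sum(comb(n, k) * a**k * (1 - a) ** (n - k) for k in range(m, n + 1))

def fisher_pooled_pvalue(pvals):
    """Pool p-values with Fisher's method (a stand-in, see lead-in).

    T = -2 * sum(ln p) is chi-square with 2n degrees of freedom under the
    null; for even degrees of freedom the upper tail has a closed form.
    Requires every p-value to lie in (0, 1].
    """
    half_t = -sum(log(p) for p in pvals)
    term, tail_sum = 1.0, 0.0
    for k in range(len(pvals)):
        if k > 0:
            term *= half_t / k  # term is now half_t**k / k!
        tail_sum += term
    return exp(-half_t) * tail_sum

# Hypothetical present-absent p-values for three probe sets on four chips.
detection_pvalues = {
    "probe_A": [0.001, 0.010, 0.200, 0.030],
    "probe_B": [0.500, 0.700, 0.020, 0.900],
    "probe_C": [0.300, 0.450, 0.600, 0.800],
}

kept_by_mn = mn_filter(detection_pvalues, m=2)
level_2_of_4 = mn_filter_level(m=2, n=4)
pooled = {probe: fisher_pooled_pvalue(p) for probe, p in detection_pvalues.items()}
kept_by_pooled = [probe for probe, p in pooled.items() if p < 0.01]
```

In this toy data only probe_A passes either filter. The m/n filter uses just the count of calls below the cutoff, while the pooled filter uses the magnitude of every p-value, which is the intuition behind the power comparison reported in the abstract.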
