Multiple Testing Procedures and Applications to Genomics
U.C. Berkeley Division of Biostatistics Working Paper Series
  • Merrill D. Birkner, Division of Biostatistics, School of Public Health, University of California, Berkeley
  • Katherine S. Pollard, Center for Molecular Science & Engineering, University of California, Santa Cruz
  • Mark J. van der Laan, Division of Biostatistics, School of Public Health, University of California, Berkeley
  • Sandrine Dudoit, Division of Biostatistics, School of Public Health, University of California, Berkeley
Date of this Version
1-5-2005
Comments
Published in Bioinformatics and Computational Biology Solutions Using R and Bioconductor, Springer, 2005.
Abstract

This chapter proposes widely applicable resampling-based single-step and stepwise multiple testing procedures (MTPs) for controlling a broad class of Type I error rates, in testing problems involving general data generating distributions (with arbitrary dependence structures among variables), null hypotheses, and test statistics (Dudoit and van der Laan, 2005; Dudoit et al., 2004a,b; van der Laan et al., 2004a,b; Pollard and van der Laan, 2004; Pollard et al., 2005). Procedures are provided to control Type I error rates defined as tail probabilities for arbitrary functions of the number of Type I errors, V_n, and the number of rejected hypotheses, R_n. These error rates include the generalized family-wise error rate, gFWER(k) = Pr(V_n > k), that is, the chance of at least (k+1) false positives (the special case k=0 corresponds to the usual family-wise error rate, FWER), and tail probabilities for the proportion of false positives among the rejected hypotheses, TPPFP(q) = Pr(V_n/R_n > q). Single-step and step-down common-cut-off (maxT) and common-quantile (minP) procedures that take into account the joint distribution of the test statistics are proposed to control the FWER. In addition, augmentation multiple testing procedures are provided to control the gFWER and TPPFP, based on any initial FWER-controlling procedure. The results of a multiple testing procedure can be summarized using rejection regions for the test statistics, confidence regions for the parameters of interest, or adjusted p-values. A key ingredient of the proposed MTPs is the null distribution of the test statistics (and a consistent bootstrap estimator thereof), which is used to derive rejection regions and the corresponding confidence regions and adjusted p-values. This chapter illustrates an implementation in SAS (Version 9) of the bootstrap-based single-step maxT procedure and of the gFWER- and TPPFP-controlling augmentation procedures. These multiple testing procedures are applied to an HIV-1 sequence dataset to identify codon positions associated with viral replication capacity.
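The single-step maxT and augmentation steps named in the abstract can be illustrated with a minimal Python sketch. This is not the chapter's SAS implementation. The function names (`maxt_adjusted_pvalues`, `augment_gfwer`, `augment_tppfp`) are hypothetical, and the sketch assumes that large test statistics are evidence against the null, that `null_stats` already holds draws from the estimated joint null distribution (for example, centered and scaled bootstrap statistics), and that V_n/R_n is taken to be 0 when nothing is rejected.

```python
def maxt_adjusted_pvalues(observed, null_stats):
    """Single-step maxT adjusted p-values (illustrative sketch).

    observed:   list of M observed test statistics.
    null_stats: list of B rows, each a list of M statistics drawn from the
                estimated joint null distribution (assumed precomputed).
    Each adjusted p-value is the fraction of null draws whose maximum
    statistic is at least as large as the observed statistic.
    """
    max_null = [max(row) for row in null_stats]
    b = len(max_null)
    return [sum(1 for m in max_null if m >= t) / b for t in observed]


def _initial_rejections(adjp, alpha):
    # Order hypotheses from most to least significant and count how many
    # the initial FWER-controlling procedure rejects at level alpha.
    order = sorted(range(len(adjp)), key=lambda i: adjp[i])
    r0 = sum(1 for p in adjp if p <= alpha)
    return order, r0


def augment_gfwer(adjp, alpha, k):
    """gFWER(k) augmentation: reject the initial set plus the next k
    most significant hypotheses (or all remaining ones, if fewer)."""
    order, r0 = _initial_rejections(adjp, alpha)
    return sorted(order[:min(r0 + k, len(adjp))])


def augment_tppfp(adjp, alpha, q):
    """TPPFP(q) augmentation: add the largest number a of extra
    rejections such that a / (a + r0) <= q."""
    order, r0 = _initial_rejections(adjp, alpha)
    a = 0
    # Compare without division so that r0 == 0 adds no rejections.
    while r0 + a < len(adjp) and (a + 1) <= q * (r0 + a + 1):
        a += 1
    return sorted(order[:r0 + a])
```

Each augmentation function returns the indices of the rejected hypotheses. Feeding the output of `maxt_adjusted_pvalues` into `augment_gfwer` or `augment_tppfp` mirrors the abstract's description of building gFWER and TPPFP control on top of an initial FWER-controlling procedure.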

Citation Information
Merrill D. Birkner, Katherine S. Pollard, Mark J. van der Laan and Sandrine Dudoit. "Multiple Testing Procedures and Applications to Genomics" (2005)
Available at: http://works.bepress.com/mark_van_der_laan/112/