Article
Correlation Between Gene Expression Levels and Limitations of the Empirical Bayes Methodology for Finding Differentially Expressed Genes
Statistical Applications in Genetics and Molecular Biology (2010)
  • Xing Qiu
  • Lev Klebanov
  • Andrei Yakovlev
Abstract

Stochastic dependence between gene expression levels in microarray data is critically important for methods of statistical inference that pool test statistics across genes. The empirical Bayes methodology, in both its nonparametric and parametric formulations, and closely related methods that employ a two-component mixture model are typical examples. It is frequently assumed that dependence between gene expression levels (or the associated test statistics) is weak enough to justify applying such methods to select differentially expressed genes. By applying resampling techniques to simulated and real biological data sets, we studied the potential impact of correlation between gene expression levels on statistical inference based on the empirical Bayes methodology. The analyses provide evidence that this impact may be quite strong, leading to a high variance in the number of genes declared differentially expressed. The study also pinpoints the specific components of the empirical Bayes method in which the reported effect manifests itself.
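
The variance effect described above can be illustrated with a small simulation. The sketch below is not the authors' procedure and does not implement the empirical Bayes method: it only assumes an equicorrelated null model (every pair of genes shares correlation `rho` through a single common factor per array), computes ordinary pooled-variance two-sample t-statistics, and counts how many exceed an arbitrary cutoff. The function name `count_rejections`, the cutoff of 2.5, and all sample sizes are hypothetical choices made for illustration.

```python
import numpy as np


def count_rejections(rho, n_genes=500, n_per_group=10, n_reps=300,
                     cutoff=2.5, seed=0):
    """Count |t| > cutoff per simulated data set under a global null.

    Assumption: genes are equicorrelated with pairwise correlation rho,
    induced by one shared standard-normal factor per array.
    """
    rng = np.random.default_rng(seed)
    counts = np.empty(n_reps, dtype=int)
    for r in range(n_reps):
        # One shared value per array (column), independent noise per gene.
        shared = rng.standard_normal((1, 2 * n_per_group))
        noise = rng.standard_normal((n_genes, 2 * n_per_group))
        x = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise

        # Split arrays into two groups of equal size.
        a, b = x[:, :n_per_group], x[:, n_per_group:]

        # Pooled-variance two-sample t-statistic for each gene.
        pooled = (a.var(axis=1, ddof=1) + b.var(axis=1, ddof=1)) / 2.0
        t = (a.mean(axis=1) - b.mean(axis=1)) / np.sqrt(pooled * 2.0 / n_per_group)

        counts[r] = np.sum(np.abs(t) > cutoff)
    return counts


if __name__ == "__main__":
    independent = count_rejections(rho=0.0)
    correlated = count_rejections(rho=0.6)
    print("variance of count, rho=0.0:", independent.var())
    print("variance of count, rho=0.6:", correlated.var())
```

No gene is truly differentially expressed in this setup, so every count is a count of false positives. Under these assumptions the count for `rho=0.6` should fluctuate far more from one simulated data set to the next than the count for `rho=0.0`, which is the kind of instability the abstract attributes to inter-gene correlation.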

Keywords
  • microarray analysis
  • gene expression
  • two-sample tests
  • empirical Bayes method
  • correlated data
  • resampling techniques
Publication Date
March 31, 2010
Citation Information
Xing Qiu, Lev Klebanov and Andrei Yakovlev. "Correlation Between Gene Expression Levels and Limitations of the Empirical Bayes Methodology for Finding Differentially Expressed Genes" Statistical Applications in Genetics and Molecular Biology Vol. 4 Iss. 1 (2010)
Available at: http://works.bepress.com/lev_klebanov/1/