Differential Expression in SAGE: accounting for normal between-library variation

Keith A. Baggerly, The University of Texas M.D. Anderson Cancer Center
Li Deng, Rice University
Jeffrey S. Morris, The University of Texas M.D. Anderson Cancer Center
C. Marcelo Aldez, The University of Texas M.D. Anderson Cancer Center

Abstract

Motivation: In contrasting levels of gene expression between groups of SAGE libraries, the libraries within each group are often combined and the counts for the tag of interest summed, and inference is made on the basis of these larger ‘pseudolibraries’. While this captures the sampling variability inherent in the procedure, it fails to allow for normal variation in levels of the gene between individuals within the same group, and can consequently overstate the significance of the results. The effect is not slight: between-library variation can be hundreds of times the within-library variation.

Results: We introduce a beta-binomial sampling model that correctly incorporates both sources of variation. We show how to fit the parameters of this model, and introduce a test statistic for differential expression similar to a two-sample t-test.
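To make the contrast concrete, the following is a minimal Python sketch, not the authors' Matlab or R code. It assumes a simple moment-based estimator: each library's tag proportion is weighted by library size, the variance of the weighted mean is estimated from the spread of proportions across libraries, and that variance is floored at the pooled binomial variance so it never falls below pure sampling variability. The function names (`group_estimate`, `t_like_statistic`, `pooled_z_statistic`), the weighting scheme, and the example counts are illustrative assumptions, not taken from the paper.

```python
import math


def group_estimate(counts, sizes):
    """Weighted tag proportion for one group and the variance of that estimate.

    Assumed estimator (illustrative): weights proportional to library size,
    between-library variance from the weighted spread of proportions, floored
    at the pooled binomial variance. Requires at least two libraries.
    """
    total = sum(sizes)
    weights = [n / total for n in sizes]
    props = [x / n for x, n in zip(counts, sizes)]
    p_hat = sum(w * p for w, p in zip(weights, props))
    sum_w2 = sum(w * w for w in weights)
    # Weighted spread of the library proportions around the group proportion.
    spread = sum(w * (p - p_hat) ** 2 for w, p in zip(weights, props))
    # Variance of the weighted mean, including between-library variation.
    var_between = sum_w2 * spread / (1.0 - sum_w2)
    # Sampling-only variance, as if all libraries were one pseudolibrary.
    var_binomial = p_hat * (1.0 - p_hat) / total
    return p_hat, max(var_between, var_binomial)


def t_like_statistic(counts_a, sizes_a, counts_b, sizes_b):
    """Two-sample t-like statistic built from the group estimates above."""
    p_a, v_a = group_estimate(counts_a, sizes_a)
    p_b, v_b = group_estimate(counts_b, sizes_b)
    return (p_a - p_b) / math.sqrt(v_a + v_b)


def pooled_z_statistic(counts_a, sizes_a, counts_b, sizes_b):
    """Two-proportion z statistic on summed pseudolibraries, for comparison."""
    n_a, n_b = sum(sizes_a), sum(sizes_b)
    p_a, p_b = sum(counts_a) / n_a, sum(counts_b) / n_b
    p = (sum(counts_a) + sum(counts_b)) / (n_a + n_b)
    return (p_a - p_b) / math.sqrt(p * (1.0 - p) * (1.0 / n_a + 1.0 / n_b))


if __name__ == "__main__":
    # Hypothetical tag counts in three libraries per group, 10,000 tags each.
    group_a = ([10, 20, 30], [10000, 10000, 10000])
    group_b = ([40, 50, 60], [10000, 10000, 10000])
    print("t-like statistic:", t_like_statistic(*group_a, *group_b))
    print("pooled z statistic:", pooled_z_statistic(*group_a, *group_b))
```

On these made-up counts the pooled statistic is larger in magnitude than the t-like statistic, which illustrates how summing libraries can overstate significance when the tag proportion varies between libraries in the same group.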

Contact: kabagg@mdanderson.org

Supplementary information: http://bioinformatics.mdanderson.org/ (includes Matlab and R code for fitting the model).

Suggested Citation

Keith A. Baggerly, Li Deng, Jeffrey S. Morris, and C. Marcelo Aldez. "Differential Expression in SAGE: accounting for normal between-library variation." Bioinformatics 19.12 (2003): 1477-1483.