
Alternative Probeset Definitions for Combining Microarray Data Across Studies Using Different Versions of Affymetrix Oligonucleotide Arrays

Jeffrey S. Morris, The University of Texas M.D. Anderson Cancer Center
Chunlei Wu, The University of Texas M.D. Anderson Cancer Center
Kevin R. Coombes, The University of Texas M.D. Anderson Cancer Center
Keith A. Baggerly, The University of Texas M.D. Anderson Cancer Center
Jing Wang, The University of Texas M.D. Anderson Cancer Center
Li Zhang, The University of Texas M.D. Anderson Cancer Center

Abstract

Many published microarray studies have small to moderate sample sizes, and thus have low statistical power to detect significant relationships between gene expression levels and outcomes of interest. By pooling data across multiple studies, however, we can gain power, enabling us to detect new relationships. This type of pooling is complicated by the fact that gene expression measurements from different microarray platforms are not directly comparable. In this chapter, we discuss two methods for combining information across different versions of Affymetrix oligonucleotide arrays. Each involves a new approach for combining probes on the array into probesets. The first approach involves identifying "matching probes" present on both chips, and then assembling them into new probesets based on UniGene clusters. We demonstrate that this method yields comparable expression level quantifications across chips without sacrificing much precision or significantly altering the relative ordering of the samples. We applied this method to combine information across two lung cancer studies performed using the HuGeneFL and U95Av2 chips, revealing some genes related to patient survival. It appears that the gain in statistical power from the pooling was key to identifying many of these genes, since most were not found by equivalent analyses performed separately on the two data sets. We have found that this approach is not feasible for combining information across the U95Av2 and U133A chips, which share fewer probes. Our second method defines probesets as sets of probes matching the same full-length mRNA transcripts in current genomic databases. We found that this method yielded comparable expression levels across the U95Av2 and U133A chip types, and had better correlation across chip types than Affymetrix’s matching probeset definitions.
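
The first approach described above can be illustrated with a minimal sketch. It is not the chapter's implementation: it assumes that "matching" means an identical probe sequence on both chips, that each chip's annotation is available as a mapping from probe sequence to UniGene cluster ID, and that a probe is kept only when both chips assign it to the same cluster. The function name, the `min_probes` parameter, and the toy sequences (real Affymetrix probes are 25-mers) are hypothetical.

```python
from collections import defaultdict


def matching_probesets(chip_a, chip_b, min_probes=1):
    """Group probes shared by two chips into probesets keyed by UniGene cluster.

    chip_a, chip_b: dict mapping probe sequence -> UniGene cluster ID
    (a hypothetical annotation layout chosen for this illustration).
    """
    # Assumption: a "matching probe" has an identical sequence on both chips.
    shared = chip_a.keys() & chip_b.keys()

    probesets = defaultdict(list)
    for seq in sorted(shared):
        cluster = chip_a[seq]
        # Assumption: skip probes whose cluster annotation differs by chip.
        if chip_b[seq] != cluster:
            continue
        probesets[cluster].append(seq)

    # Drop clusters with too few matching probes to form a probeset.
    return {c: p for c, p in probesets.items() if len(p) >= min_probes}


# Toy annotations; the sequences and cluster IDs are invented.
hugenefl = {"AAAC": "Hs.1", "CCGT": "Hs.1", "GGTA": "Hs.2", "TTAG": "Hs.3"}
u95av2 = {"AAAC": "Hs.1", "CCGT": "Hs.1", "GGTA": "Hs.2", "CATG": "Hs.3"}

all_sets = matching_probesets(hugenefl, u95av2)
strict_sets = matching_probesets(hugenefl, u95av2, min_probes=2)
print(all_sets)     # {'Hs.1': ['AAAC', 'CCGT'], 'Hs.2': ['GGTA']}
print(strict_sets)  # {'Hs.1': ['AAAC', 'CCGT']}
```

Raising `min_probes` shows why the approach depends on how many probes two chip versions share: clusters with too few matching probes drop out, which is consistent with the abstract's remark that the method is not feasible for the U95Av2 and U133A chips.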

Suggested Citation

Jeffrey S. Morris, Chunlei Wu, Kevin R. Coombes, Keith A. Baggerly, Jing Wang, and Li Zhang. "Alternative Probeset Definitions for Combining Microarray Data Across Studies Using Different Versions of Affymetrix Oligonucleotide Arrays" Meta-Analysis in Genetics. Ed. Rudy Guerra and David Allison. New York: Chapman-Hall, 2006.
Available at: http://works.bepress.com/jeffrey_s_morris/17