Sandrine Dudoit is Associate Professor of Biostatistics and Statistics at the University of California, Berkeley. Professor Dudoit's research and teaching activities concern the development and application of statistical and computational methods to address problems in biomedical and genomic research.

Specific areas of interest include:
* the design and analysis of high-throughput gene expression experiments (e.g., cDNA microarrays, alternative splicing microarrays, ChIP-Chip, metagenomics microarrays);
* nucleotide and protein sequence analysis (e.g., identification of regulatory motifs in DNA sequences);
* the genetic mapping of complex traits (e.g., IBD-based linkage analysis, linkage disequilibrium analysis, SNP-based association studies, microarray-based genetic mapping studies of gene expression);
* the analysis of biological annotation metadata (e.g., Gene Ontology (GO) annotation).

Her methodological research interests include:
* loss-based estimation with cross-validation: parametric and non-parametric density estimation and regression, variable selection;
* multiple hypothesis testing: resampling-based multiple testing procedures for controlling generalized Type I error rates, defined as tail probabilities and expected values for arbitrary functions of the numbers of Type I errors and rejected hypotheses (e.g., false discovery rate).

Professor Dudoit is also involved in the development of statistical software for biomedical and genomic data analysis and is a core member of the Bioconductor Project (www.bioconductor.org).

Professor Dudoit obtained a Bachelor's (1992) and Master's (1994) degree in Mathematics from Carleton University, Ottawa, Canada. She first came to UC Berkeley as a graduate student and earned a PhD degree in 1999 from the Department of Statistics. Her doctoral research, under the supervision of Professor Terence P. Speed, concerned the linkage analysis of complex human traits.
From 1999 to 2000, she was a postdoctoral fellow at the Mathematical Sciences Research Institute, Berkeley. Before joining the Faculty at UC Berkeley in July 2001, she underwent a year of postdoctoral training in genomics in the laboratory of Professor Patrick O. Brown, Department of Biochemistry, Stanford University. Her work in the Brown Lab involved the development of statistical and computational methods for the design and analysis of gene expression experiments using DNA microarrays.
Biological Annotation Metadata Analysis
Multiple Tests of Association with Biological Annotation Metadata (with Sunduz Keles and Mark J. van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
We propose a general and formal statistical framework for the multiple tests of associations between...
Biological Sequence Analysis
Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Mark J. van der Laan, Biao Xing, and Michael B. Eisen), Statistical Applications in Genetics and Molecular Biology (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology....
Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Mark J. van der Laan, Biao Xing, and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary...
Genetic Mapping
A Fine-Scale Linkage-Disequilibrium Measure Based on Length of Haplotype Sharing (with Yan Wang and Lue Ping Zhao), The American Journal of Human Genetics (2006)
High-throughput genotyping technologies for SNPs have enabled the recent completion of the International HapMap Project...
A Fine-Scale Linkage Disequilibrium Measure Based on Length of Haplotype Sharing (with Yan Wang and Lue Ping Zhao), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
High-throughput genotyping technologies for single nucleotide polymorphisms (SNP) have enabled the recent completion of the...
Quantification and Visualization of LD Patterns and Identification of Haplotype Blocks (with Yan Wang), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Classical measures of linkage disequilibrium (LD) between two loci, based only on the joint distribution...
IBD Configuration Transition Matrices and Linkage Score Tests for Unilineal Relative Pairs, U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Properties of transition matrices between IBD configurations are derived for four general classes of unilineal...
Loss-Based Estimation with Cross-Validation
A deletion/substitution/addition algorithm for classification neural networks, with applications to biomedical data (with Blythe Durbin and Mark J. van der Laan), Journal of Statistical Planning and Inference (2008)
Neural networks are a popular machine learning tool, particularly in applications such as protein structure...
Loss-based estimation with evolutionary algorithms and cross-validation (with David Shilane and Richard H. Liang), U.C. Berkeley Division of Biostatistics Working Paper Series (2007)
Many statistical inference methods rely upon selection procedures to estimate a parameter of the joint...
Survival Ensembles (with Torsten Hothorn, Peter Buhlmann, Annette M. Molinaro, and Mark J. van der Laan), Biostatistics (2006)
We propose a unified and flexible framework for ensemble learning in the presence of censoring....
Oracle inequalities for multi-fold cross validation (with Aad W. van der Vaart and Mark J. van der Laan), Statistics & Decisions (2006)
We consider choosing an estimator or model from a given class by cross validation consisting...
The cross-validated adaptive epsilon-net estimator (with Mark J. van der Laan and Aad W. van der Vaart), Statistics & Decisions (2006)
Suppose that we observe a sample of independent and identically distributed realizations of a random...
Microarray Data Analysis
Prognosis of stage II colon cancer by non-neoplastic mucosa gene expression profiling (with A. Barrier, F. Roser, P-Y. Boelle, B. Franc, C. Tse, D. Brault, F. Lacaine, S. Houry, P. Callard, C. Penna, B. Debuire, A. Flahault, and A. Lemoine), Oncogene (2007)
We have assessed the possibility to build a prognosis predictor (PP), based on non-neoplastic mucosa...
Stage II Colon Cancer Prognosis Prediction by Tumor Gene Expression Profiling (with Alain Barrier, Pierre-Yves Boelle, François Roser, Jennifer Gregg, Chantal Tse, Didier Brault, François Lacaine, Sidney Houry, Michel Huguier, Brigitte Franc, Antoine Flahault, and Antoinette Lemoine), Journal of Clinical Oncology (2006)
PURPOSE: This study mainly aimed to identify and assess the performance of a microarray-based prognosis...
Multiple Testing Methods For ChIP–Chip High Density Oligonucleotide Array Data (with Sündüz Keleş, Mark J. van der Laan, and Simon E. Cawley), Journal of Computational Biology (2006)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
Exploration of global gene expression in human liver steatosis by high-density oligonucleotide microarray (with Frank Chiappini, Alain Barrier, Raphaël Saffroy, Marie-Charlotte Domart, Nicolas Dagues, Daniel Azoulay, Mylène Sebagh, Brigitte Franc, Stephan Chevalier, Brigitte Debuire, and Antoinette Lemoine), Laboratory Investigation (2005)
Understanding the molecular mechanisms underlying fatty liver disease (FLD) in humans is of major importance....
Gene expression profiling of nonneoplastic mucosa may predict clinical outcome of colon cancer patients (with Alain Barrier, Pierre-Yves Boelle, Antoinette Lemoine, Chantal Tse, Didier Brault, Frank Chiappini, François Lacaine, Sidney Houry, Michel Huguier, and Antoine Flahault), Diseases of the Colon and Rectum (2005)
PURPOSE This study assessed the possibility to build a prognosis predictor, based on microarray gene...
Miscellaneous
A General Framework for Statistical Performance Comparison of Evolutionary Computation Algorithms (with David Shilane, Jarno Martikainen, and Seppo Ovaska), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
This paper proposes a statistical methodology for comparing the performance of evolutionary computation algorithms. A...
Multiple Hypothesis Testing
Resampling-based empirical Bayes multiple testing procedures for controlling generalized tail probability and expected value error rates: Focus on the false discovery rate and simulation stud (with Houston N. Gilbert and Mark J. van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (2007)
This article proposes resampling-based empirical Bayes multiple testing procedures for controlling a broad class of...
A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Mark van der Laan), Statistical Applications in Genetics and Molecular Biology (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Mark J. van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
Test statistics null distributions in multiple testing: Simulation studies and applications to genomics (with Katherine S. Pollard, Merrill D. Birkner, and Mark J. van der Laan), Journal de la Société Française de Statistique (2005)
Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...
Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Mark J. van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...
Statistical Computing
Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Mark J. van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
Bioconductor: open software development for computational biology and bioinformatics (with Robert C. Gentleman, Vincent J. Carey, Douglas M. Bates, Ben Bolstad, Marcel Dettling, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Gunther Sawitzki, Colin Smith, Gordon Smyth, Luke Tierney, Jean Y. H. Yang, and Jianhua Zhang), Genome Biology (2004)
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational...
Bioconductor: Open software development for computational biology and bioinformatics (with Robert C. Gentleman, Vincent J. Carey, Douglas J. Bates, Benjamin M. Bolstad, Marcel Dettling, Byron Ellis, Laurent Gautier, Yongchao Ge, Jeff Gentry, Kurt Hornik, Torsten Hothorn, Wolfgang Huber, Stefano Iacus, Rafael Irizarry, Friedrich Leisch, Cheng Li, Martin Maechler, Anthony J. Rossini, Guenther Sawitzki, Colin Smith, Gordon K. Smyth, Luke Tierney, Yee Hwa Yang, and Jianhua Zhang), Bioconductor Project Working Papers (2004)
The Bioconductor project is an initiative for the collaborative creation of extensible software for computational...