Our research involves developing statistical methods and theory for the analysis
of data that commonly arise in randomized controlled trials and observational studies. In
particular, we are concerned with methods that properly handle informative
censoring, confounding, the curse of dimensionality, multiple testing, and data-adaptive
selection of models. Our philosophy is targeted learning, formalized by our recent work
on targeted maximum likelihood learning and unified loss-based learning. This
statistical approach aims to let the data speak for the purpose of answering a particular
scientific question of interest, and to provide robust tests of null hypotheses of interest.
We are continuously concerned with bringing these methods into practice and benchmarking
them by their practical performance on simulated and real data.

Biological Annotation Metadata Analysis

Link

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Biological Sequence Analysis

Link

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), Statistical Applications in Genetics and Molecular Biology (2006)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology....
 

Link

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary...
 

Categorical Data Analysis

PDF

Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives (with Merrill D. Birkner and Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

PDF

Choice of Monitoring Mechanism for Optimal Nonparametric Functional Estimation for Binary Data (with Nicholas P. Jewell and Stephen Shiboski), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Optimal designs of dose levels in order to estimate parameters from a model for binary...
 

Link

Estimation of Treatment Effects in Randomized Trials with Noncompliance and a Dichotomous Outcome (with Alan E. Hubbard and Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
We propose a class of estimators of the treatment effect on a dichotomous outcome among...
 

Clinical Epidemiology

PDF

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Clinical Trials

Link

Estimation of Treatment Effects in Randomized Trials with Noncompliance and a Dichotomous Outcome (with Alan E. Hubbard and Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
We propose a class of estimators of the treatment effect on a dichotomous outcome among...
 

PDF

Estimation of Direct and Indirect Causal Effects in Longitudinal Studies (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

PDF

Comparison of the Inverse Probability of Treatment Weighted (IPTW) Estimator With a Naïve Estimator in the Analysis of Longitudinal Data With Time-Dependent Confounding: A Simulation Study (with Thaddeus Haight, Romain Neugebauer, and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
A simulation study was conducted to compare estimates from a naïve estimator, using standard conditional...
 

PDF

Measuring Treatment Effects Using Semiparametric Models (with Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
In order to estimate the causal effect of treatments on an outcome of interest, one...
 

Computation

PDF

Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives (with Merrill D. Birkner and Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

PDF

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

PDF

Multiple Testing Procedures and Applications to Genomics (with Merrill D. Birkner, Katherine S. Pollard, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
This chapter proposes widely applicable resampling-based single-step and stepwise multiple testing procedures (MTP) for controlling...
 

PDF

Data Adaptive Estimation of the Treatment Specific Mean (with Yue Wang and Oliver Bembom), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
An important problem in epidemiology and medical research is the estimation of the causal effect...
 

PDF

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimes (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Computational Biology/Bioinformatics

PDF

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), Statistical Applications in Genetics and Molecular Biology (2006)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology....
 

PDF

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

PDF

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

PDF

Statistical Inference for Simultaneous Clustering of Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Current methods for analysis of gene expression data are mostly based on clustering and classification...
 

Link

Paired and Unpaired Comparisons and Clustering with Gene Expression Data (with Jennifer F. Bryan and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
We have previously described a statistical framework for using gene expression data from cDNA microarrays...
 

Design of Experiments and Sample Surveys

Link

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Disease Modeling

PDF

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

PDF

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

PDF

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

PDF

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

PDF

Comparative Genomic Hybridization Array Analysis (with Annette M. Molinaro and Dan H. Moore), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)

At the present time, there is increasing evidence that cancer may be regulated by the...

 

Epidemiology

PDF

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

PDF

Extending Marginal Structural Models through Local, Penalized, and Additive Learning (with Daniel Rubin), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

Marginal structural models (MSMs) allow one to form causal inferences from data, by specifying a...

 

PDF

History-Adjusted Marginal Structural Models to Estimate Time-Varying Effect Modification (with Maya L. Petersen, Steven G. Deeks, and Jeffrey N. Martin), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Much of epidemiology and clinical medicine is focused on the estimation of treatments or interventions...
 

PDF

Population Intervention Models in Causal Inference (with Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

PDF

Estimation of Direct Causal Effects (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Many common problems in epidemiologic and clinical research involve estimating the effect of an exposure...
 

General Biostatistics

PDF

Statistical Inference for Variable Importance, The International Journal of Biostatistics (2006)
Many statistical problems involve the learning of an importance/effect of a variable for predicting an...
 

PDF

Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data (with Merrill D. Birkner, Alan E. Hubbard, Christine F. Skibola, Christine M. Hegedus, and Martyn T. Smith), Statistical Applications in Genetics and Molecular Biology (2006)
A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined...
 

PDF

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

PDF

Augmentation Procedures for Control of the Generalized Family-Wise Error Rate and Tail Probabilities for the Proportion of False Positives (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
This article shows that any single-step or stepwise multiple testing procedure (asymptotically) controlling the family-wise...
 

PDF

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Genetics

PDF

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Human Genetics

PDF

Application of a Multiple Testing Procedure Controlling the Proportion of False Positives to Protein and Bacterial Data (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is...
 

PDF

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

PDF

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

PDF

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

PDF

Tree-based Multivariate Regression and Density Estimation with Right-Censored Data (with Annette M. Molinaro and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)

We propose a unified strategy for estimator construction, selection, and performance assessment in the presence...

 

Laboratory and Basic Science Research

PDF

Supervised Detection of Conserved Motifs in DNA Sequences with cosmo (with Oliver Bembom and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

Identification of transcription factor binding sites is a major interest in contemporary biological research. A...

 

PDF

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

PDF

Application of a Multiple Testing Procedure Controlling the Proportion of False Positives to Protein and Bacterial Data (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is...
 

PDF

Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)

Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...

 

PDF

Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
 

Longitudinal Data Analysis and Time Series

PDF

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

PDF

Individualized Treatment Rules: Generating Candidate Clinical Trials (with Maya L. Petersen and Steven G. Deeks), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Statistical methods have rarely been applied to learn individualized treatment rules, or rules for altering...
 

PDF

Direct Effect Models (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

PDF

G-computation Estimation of Nonparametric Causal Effects on Time-Dependent Mean Outcomes in Longitudinal Studies (with Romain Neugebauer), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Two approaches to Causal Inference based on Marginal Structural Models (MSM) have been proposed. They...
 

PDF

Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models (with Romain Neugebauer and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because...
 

Loss-Based Estimation with Cross-Validation

Link

Asymptotic Optimality of Likelihood-Based Cross-Validation (with Sandrine Dudoit and Sunduz Keles), Statistical Applications in Genetics and Molecular Biology (2006)
Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d....
 

Link

Survival Ensembles (with Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, and Annette M. Molinaro), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
We propose a unified and flexible framework for ensemble learning in the presence of censoring....
 

Link

Optimization of the Architecture of Neural Networks Using a Deletion/Substitution/Addition Algorithm (with Blythe Durbin and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Neural networks are a popular machine learning tool, particularly in applications such as the prediction...
 

Link

The Cross-Validated Adaptive Epsilon-Net Estimator (with Sandrine Dudoit and Aad W. van der Vaart), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Suppose that we observe a sample of independent and identically distributed realizations of a random...
 

Link

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

Medical Specialties

PDF

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

PDF

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

PDF

Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data (with Sunduz Keles, Sandrine Dudoit, and Simon E. Cawley), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
 

Microarray Data Analysis

Link

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

Link

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Link

Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data (with Sunduz Keles, Sandrine Dudoit, and Simon E. Cawley), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
 

Microarrays

PDF

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

PDF

Regulatory Motif Finding by Logic Regression (with Sunduz Keles and Chris Vulpe), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)

Multiple transcription factors coordinately control transcriptional regulation of genes in eukaryotes. Although multiple computational methods...

 

PDF

A Statistical Method for Constructing Transcriptional Regulatory Networks Using Gene Expression and Sequence Data (with Biao Xing), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Transcriptional regulation is one of the most important means of gene regulation. Uncovering transcriptional regulatory...
 

PDF

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

PDF

A Method to Identify Significant Clusters in Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Clustering algorithms have been widely applied to gene expression data. For both hierarchical and partitioning...
 

Multiple Hypothesis Testing

Link

Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise...
 

Link

Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes general single-step multiple testing procedures for controlling Type I error rates...
 

Link

Augmentation Procedures for Control of the Generalized Family-Wise Error Rate and Tail Probabilities for the Proportion of False Positives (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
This article shows that any single-step or stepwise multiple testing procedure (asymptotically) controlling the family-wise...
 

Link

A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Sandrine Dudoit), Statistical Applications in Genetics and Molecular Biology (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
 

Link

A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
 

Multivariate Analysis

PDF

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

PDF

Application of a Variable Importance Measure Method (with Merrill D. Birkner), The International Journal of Biostatistics (2006)
van der Laan (2005) proposed a targeted method used to construct variable importance measures coupled...
 

PDF

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

PDF

Data Adaptive Pathway Testing (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
A majority of diseases are caused by a combination of factors, for example, composite genetic...
 

PDF

Application of a Variable Importance Measure Method to HIV-1 Sequence Data (with Merrill D. Birkner), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
van der Laan (2005) proposed a method to construct variable importance measures and provided the...
 

Statistical Computing

Link

Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
 

Statistical Models

PDF

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

PDF

Deletion/Substitution/Addition Algorithm in Learning with Applications in Genomics (with Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
van der Laan and Dudoit (2003) provide a road map for estimation and performance assessment...
 

PDF

Cross-Validated Bagged Prediction of Survival (with Sandra E. Sinisi and Romain Neugebauer), Statistical Applications in Genetics and Molecular Biology (2006)
In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the...
 

PDF

Super Learning: an Application to Prediction of HIV-1 Drug Susceptibility (with Sandra E. Sinisi and Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Many statistical methods exist that can be used to learn a predictor based on observed...
 

PDF

Causal Effect Models for Intention to Treat and Realistic Individualized Treatment Rules, U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

An important class of models in causal inference are the so-called marginal structural models which...

 

Statistical Theory and Methods

PDF

Quantile-Function Based Null Distribution in Resampling Based Multiple Testing (with Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

PDF

Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise...
 

PDF

Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes general single-step multiple testing procedures for controlling Type I error rates...
 

PDF

Estimating a Survival Distribution with Current Status Data and High-dimensional Covariates (with Aad van der Vaart), The International Journal of Biostatistics (2006)
We consider the inverse problem of estimating a survival distribution when the survival times are...
 

PDF

Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives (with Merrill D. Birkner and Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

Survival Analysis

PDF

Cross-Validated Bagged Prediction of Survival (with Sandra E. Sinisi and Romain Neugebauer), Statistical Applications in Genetics and Molecular Biology (2006)
In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the...
 

PDF

Choice of Monitoring Mechanism for Optimal Nonparametric Functional Estimation for Binary Data (with Nicholas P. Jewell and Stephen Shiboski), The International Journal of Biostatistics (2006)
Optimal designs of dose levels in order to estimate parameters from a model for binary...
 

PDF

Doubly Robust Censoring Unbiased Transformations (with Daniel Rubin), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
We consider random design nonparametric regression when the response variable is subject to right censoring....
 

PDF

Cross-validated Bagged Prediction of Survival (with Sandra E. Sinisi), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the...
 

PDF

Survival Point Estimate Prediction in Matched and Non-Matched Case-Control Subsample Designed Studies (with Annette M. Molinaro, Dan H. Moore, and Karla Kerlikowske), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)

Providing information about the risk of disease and clinical factors that may increase or...

 

No subject area

Link

Locally Efficient Estimation with Bivariate Right Censored Data (with Christopher M. Quale and James M. Robins), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Estimation for bivariate right censored data is a problem that has had much study over...
 

Link

Smooth Estimation of a Monotone Density (with Aad W. van der Vaart), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
We investigate the interplay of smoothness and monotonicity assumptions when estimating a density from a...
 

Link

Fitting of Mixtures with Unspecified Number of Components Using Cross Validation Distance Estimate (with Maja Miloslavsky), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Estimation of the number of mixture components (k) is an unsolved problem. Available methods for...
 

Link

Locally Efficient Estimation in Censored Data Models: Theory and Examples (with Richard D. Gill and James M. Robins), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
In many applications the observed data can be viewed as a censored high dimensional full...
 

Link

Estimation with Interval Censored Data in Longitudinal Studies, U.C. Berkeley Division of Biostatistics Working Paper Series (1998)

In biostatistical applications interest often focuses on the estimation of the distribution of a time-until-event...