Biological Annotation Metadata Analysis

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Biological Sequence Analysis

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), Statistical Applications in Genetics and Molecular Biology (2006)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology....
 

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary...
 

Categorical Data Analysis

Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives. (with Merrill D. Birkner and Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

Choice of Monitoring Mechanism for Optimal Nonparametric Functional Estimation for Binary Data (with Nicholas P. Jewell and Stephen Shiboski), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Optimal designs of dose levels in order to estimate parameters from a model for binary...
 

Estimation of Treatment Effects in Randomized Trials with Noncompliance and a Dichotomous Outcome (with Alan E. Hubbard and Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
We propose a class of estimators of the treatment effect on a dichotomous outcome among...
 

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary...
 

Subset Selection Based on Order Statistics from Logistic Populations (with Paul van der Laan), U.C. Berkeley Division of Biostatistics Working Paper Series (1998)

Consider k equal size treatment groups and let the outcome of interest be a survival...

 

Clinical Epidemiology

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Clinical Trials

Estimation of Treatment Effects in Randomized Trials with Noncompliance and a Dichotomous Outcome (with Alan E. Hubbard and Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
We propose a class of estimators of the treatment effect on a dichotomous outcome among...
 

Estimation of Direct and Indirect Causal Effects in Longitudinal Studies (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

Comparison of the Inverse Probability of Treatment Weighted (IPTW) Estimator With a Naïve Estimator in the Analysis of Longitudinal Data With Time-Dependent Confounding: A Simulation Study (with Thaddeus Haight, Romain Neugebauer, and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
A simulation study was conducted to compare estimates from a naïve estimator, using standard conditional...
 

Measuring Treatment Effects Using Semiparametric Models (with Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
In order to estimate the causal effect of treatments on an outcome of interest, one...
 

Estimating Causal Parameters in Marginal Structural Models with Unmeasured Confounders Using Instrumental Variables (with Tanya A. Henneman and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
For statisticians analyzing medical data, a significant problem in determining the causal effect of a...
 

Computation

Empirical Bayes and Resampling Based Multiple Testing Procedure Controlling Tail Probability of the Proportion of False Positives. (with Merrill D. Birkner and Alan E. Hubbard), Statistical Applications in Genetics and Molecular Biology (2006)
Simultaneously testing a collection of null hypotheses about a data generating distribution based on a...
 

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

Multiple Testing Procedures and Applications to Genomics (with Merrill D. Birkner, Katherine S. Pollard, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
This chapter proposes widely applicable resampling-based single-step and stepwise multiple testing procedures (MTP) for controlling...
 

Data Adaptive Estimation of the Treatment Specific Mean (with Yue Wang and Oliver Bembom), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
An important problem in epidemiology and medical research is the estimation of the causal effect...
 

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimes (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Loss-Based Cross-Validated Deletion/Substitution/Addition Algorithms in Estimation (with Sandra E. Sinisi), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
In van der Laan and Dudoit (2003) we propose and theoretically study a unified loss...
 

Unified Cross-Validation Methodology For Selection Among Estimators and a General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities and Examples (with Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)

In Part I of this article we propose a general cross-validation criterion for selecting among...

 

Locally Efficient Estimation of Nonparametric Causal Effects on Mean Outcomes in Longitudinal Studies (with Romain Neugebauer), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Marginal Structural Models (MSM) have been introduced by Robins (1998a) as a powerful tool for...
 

Resampling-based Multiple Testing: Asymptotic Control of Type I Error and Applications to Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
We define a general statistical framework for multiple hypothesis testing and show that the correct...
 

Double Robust Estimation in Longitudinal Marginal Structural Models (with Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Consider estimation of causal parameters in a marginal structural model for the discrete intensity of...
 

An Empirical Study of Marginal Structural Models for Time-Independent Treatment (with Tanya A. Henneman), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
In non-randomized treatment studies a significant problem for statisticians is determining how best to adjust...
 

Bivariate Current Status Data (with Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
In many applications, it is often of interest to estimate a bivariate distribution of two...
 

A New Partitioning Around Medoids Algorithm (with Katherine S. Pollard and Jennifer Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Kaufman & Rousseeuw (1990) proposed a clustering algorithm Partitioning Around Medoids (PAM) which maps a...
 

Computational Biology/Bioinformatics

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), Statistical Applications in Genetics and Molecular Biology (2006)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary biology....
 

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data. (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

Statistical Inference for Simultaneous Clustering of Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Current methods for analysis of gene expression data are mostly based on clustering and classification...
 

Paired and Unpaired Comparisons and Clustering with Gene Expression Data (with Jennifer F. Bryan and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
We have previously described a statistical framework for using gene expression data from cDNA microarrays...
 

Hybrid Clustering of Gene Expression Data with Visualization and the Bootstrap (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Large-scale gene expression studies are becoming increasingly common as new technologies make it possible to...
 

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Design of Experiments and Sample Surveys

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Disease Modeling

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data. (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

Comparative Genomic Hybridization Array Analysis (with Annette M. Molinaro and Dan H. Moore), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)

At the present time, there is increasing evidence that cancer may be regulated by the...

 

Efficient Estimation of the Lifetime and Disease Onset Distribution (with Nicholas P. Jewell and Derick R. Peterson), U.C. Berkeley Division of Biostatistics Working Paper Series (1997)
We study efficient nonparametric maximum likelihood estimation of the distribution of onset and lifetime associated...
 

Epidemiology

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Extending Marginal Structural Models through Local, Penalized, and Additive Learning (with Daniel Rubin), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

Marginal structural models (MSMs) allow one to form causal inferences from data, by specifying a...

 

History-Adjusted Marginal Structural Models to Estimate Time-Varying Effect Modification (with Maya L. Petersen, Steven G. Deeks, and Jeffrey N. Martin), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Much of epidemiology and clinical medicine is focused on the estimation of treatments or interventions...
 

Population Intervention Models in Causal Inference (with Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Estimation of Direct Causal Effects (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Many common problems in epidemiologic and clinical research involve estimating the effect of an exposure...
 

Direct Effect Models (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models (with Romain Neugebauer and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because...
 

History-Adjusted Marginal Structural Models: Optimal Treatment Strategies (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Much of clinical medicine involves choosing a future treatment plan that is expected to optimize...
 

History-Adjusted Marginal Structural Models: Time-Varying Effect Modification (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Data Adaptive Estimation of the Treatment Specific Mean (with Yue Wang and Oliver Bembom), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
An important problem in epidemiology and medical research is the estimation of the causal effect...
 

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimes (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Estimation of Treatment Effects in Randomized Trials with Noncompliance and a Dichotomous Outcome (with Alan E. Hubbard and Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
We propose a class of estimators of the treatment effect on a dichotomous outcome among...
 

Estimation of Direct and Indirect Causal Effects in Longitudinal Studies (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data (with Sunduz Keles, Sandrine Dudoit, and Simon E. Cawley), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
 

Analysis of Longitudinal Marginal Structural Models (with Jennifer F. Bryan and Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
In this article we construct and study estimators of the causal effect of a time-dependent...
 

An Empirical Study of Marginal Structural Models for Time-Independent Treatment (with Tanya A. Henneman), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
In non-randomized treatment studies a significant problem for statisticians is determining how best to adjust...
 

Case-Control Current Status Data (with Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Current status observation on survival times has recently been widely studied. An extreme form of...
 

Current Status Data: Review, Recent Developments and Open Problems (with Nicholas P. Jewell), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Researchers working with survival data are by now adept at handling issues associated with incomplete...
 

Estimating Causal Parameters in Marginal Structural Models with Unmeasured Confounders Using Instrumental Variables (with Tanya A. Henneman and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
For statisticians analyzing medical data, a significant problem in determining the causal effect of a...
 

Efficient Estimation of the Lifetime and Disease Onset Distribution (with Nicholas P. Jewell and Derick R. Peterson), U.C. Berkeley Division of Biostatistics Working Paper Series (1997)
We study efficient nonparametric maximum likelihood estimation of the distribution of onset and lifetime associated...
 

General Biostatistics

Statistical Inference for Variable Importance, The International Journal of Biostatistics (2006)
Many statistical problems involve the learning of an importance/effect of a variable for predicting an...
 

Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data (with Merrill D. Birkner, Alan E. Hubbard, Christine F. Skibola, Christine M. Hegedus, and Martyn T. Smith), Statistical Applications in Genetics and Molecular Biology (2006)
A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined...
 

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Augmentation Procedures for Control of the Generalized Family-Wise Error Rate and Tail Probabilities for the Proportion of False Positives (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
This article shows that any single-step or stepwise multiple testing procedure (asymptotically) controlling the family-wise...
 

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Issues of Processing and Multiple Testing of SELDI-TOF MS Proteomic Data (with Merrill D. Birkner, Alan E. Hubbard, Christine F. Skibola, Christine M. Hegedus, and Martyn T. Smith), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
A new data filtering method for SELDI-TOF MS proteomic spectra data is described. We examined...
 

Data Adaptive Pathway Testing (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
A majority of diseases are caused by a combination of factors, for example, composite genetic...
 

Application of a Variable Importance Measure Method to HIV-1 Sequence Data (with Merrill D. Birkner), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
van der Laan (2005) proposed a method to construct variable importance measures and provided the...
 

Population Intervention Models in Causal Inference (with Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Direct Effect Models (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

Statistical Inference for Variable Importance, U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Many statistical problems involve the learning of an importance/effect of a variable for predicting an...
 

Application of a Multiple Testing Procedure Controlling the Proportion of False Positives to Protein and Bacterial Data (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is...
 

Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)

Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...

 

Estimating Function Based Cross-Validation and Learning (with Daniel Rubin), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Suppose that we observe a sample of independent and identically distributed realizations of a...
 

Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models (with Romain Neugebauer and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because...
 

Optimization of the Architecture of Neural Networks Using a Deletion/Substitution/Addition Algorithm (with Blythe Durbin and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Neural networks are a popular machine learning tool, particularly in applications such as the prediction...
 

Genetics

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Human Genetics

Application of a Multiple Testing Procedure Controlling the Proportion of False Positives to Protein and Bacterial Data (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is...
 

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

Tree-based Multivariate Regression and Density Estimation with Right-Censored Data (with Annette M. Molinaro and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)

We propose a unified strategy for estimator construction, selection, and performance assessment in the presence...

 

Supervised Detection of Regulatory Motifs in DNA Sequences (with Sunduz Keles, Sandrine Dudoit, Biao Xing, and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Identification of transcription factor binding sites (regulatory motifs) is a major interest in contemporary...
 

A Method to Identify Significant Clusters in Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Clustering algorithms have been widely applied to gene expression data. For both hierarchical and partitioning...
 

Comparative Genomic Hybridization Array Analysis (with Annette M. Molinaro and Dan H. Moore), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)

At the present time, there is increasing evidence that cancer may be regulated by the...

 

A New Partitioning Around Medoids Algorithm (with Katherine S. Pollard and Jennifer Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Kaufman & Rousseeuw (1990) proposed a clustering algorithm Partitioning Around Medoids (PAM) which maps a...
 

Identification of Regulatory Elements Using A Feature Selection Method (with Sunduz Keles and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Many methods have been described to identify regulatory motifs in the transcription control regions of...
 

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Laboratory and Basic Science Research

Supervised Detection of Conserved Motifs in DNA Sequences with cosmo (with Oliver Bembom and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

Identification of transcription factor binding sites is a major interest in contemporary biological research. A...

 

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)

We propose a general and formal statistical framework for the multiple tests of associations between...

 

Application of a Multiple Testing Procedure Controlling the Proportion of False Positives to Protein and Bacterial Data (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Simultaneously testing multiple hypotheses is important in high-dimensional biological studies. In these situations, one is...
 

Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)

Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...

 

Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
 

Longitudinal Data Analysis and Time Series

History-Adjusted Marginal Structural Models and Statically-Optimal Dynamic Treatment Regimens (with Maya L. Petersen and Marshall M. Joffe), The International Journal of Biostatistics (2006)
Marginal structural models (MSM) provide a powerful tool for estimating the causal effect of a...
 

Individualized Treatment Rules: Generating Candidate Clinical Trials (with Maya L. Petersen and Steven G. Deeks), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Statistical methods have rarely been applied to learn individualized treatment rules, or rules for altering...
 

Direct Effect Models (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

G-computation Estimation of Nonparametric Causal Effects on Time-Dependent Mean Outcomes in Longitudinal Studies (with Romain Neugebauer), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Two approaches to Causal Inference based on Marginal Structural Models (MSM) have been proposed. They...
 

Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models (with Romain Neugebauer and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because...
 

A Causal Inference Approach for Constructing Transcriptional Regulatory Networks (with Biao Xing), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Transcriptional regulatory networks specify the interactions among regulatory genes and between regulatory genes and their...
 

Estimation of Direct and Indirect Causal Effects in Longitudinal Studies (with Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The causal effect of a treatment on an outcome is generally mediated by several intermediate...
 

Comparison of the Inverse Probability of Treatment Weighted (IPTW) Estimator With a Naïve Estimator in the Analysis of Longitudinal Data With Time-Dependent Confounding: A Simulation Study (with Thaddeus Haight, Romain Neugebauer, and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
A simulation study was conducted to compare estimates from a naïve estimator, using standard conditional...
 

Locally Efficient Estimation of Nonparametric Causal Effects on Mean Outcomes in Longitudinal Studies (with Romain Neugebauer), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Marginal Structural Models (MSM) have been introduced by Robins (1998a) as a powerful tool for...
 

Double Robust Estimation in Longitudinal Marginal Structural Models (with Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Consider estimation of causal parameters in a marginal structural model for the discrete intensity of...
 

Construction of Counterfactuals and the G-computation Formula (with Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Robins' causal inference theory assumes existence of treatment specific counterfactual variables so that the observed...
 

Analysis of Longitudinal Marginal Structural Models (with Jennifer F. Bryan and Zhuo Yu), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
In this article we construct and study estimators of the causal effect of a time-dependent...
 

Estimating Causal Parameters in Marginal Structural Models with Unmeasured Confounders Using Instrumental Variables (with Tanya A. Henneman and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
For statisticians analyzing medical data, a significant problem in determining the causal effect of a...
 

Locally Efficient Estimation of a Multivariate Survival Function in Longitudinal Studies (with Alan E. Hubbard and James M. Robins), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
In this paper we develop a locally efficient one-step estimator of a multivariate survival function...
 

Locally Efficient Estimation of the Survival Distribution with Right Censored Data and Covariates When Collection of Data is Delayed (with Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (1997)
For many sources of survival data, there is a delay between the recording of vital...
 

Locally Efficient Estimation with Current Status Data and Time-Dependent Covariates (with James M. Robins), U.C. Berkeley Division of Biostatistics Working Paper Series (1997)
In biostatistical applications interest often focuses on the estimation of the distribution of a failure...
 

Loss-Based Estimation with Cross-Validation

Asymptotic Optimality of Likelihood-Based Cross-Validation (with Sandrine Dudoit and Sunduz Keles), Statistical Applications in Genetics and Molecular Biology (2006)
Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d....
 

Survival Ensembles (with Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, and Annette M. Molinaro), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
We propose a unified and flexible framework for ensemble learning in the presence of censoring....
 

Optimization of the Architecture of Neural Networks Using a Deletion/Substitution/Addition Algorithm (with Blythe Durbin and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Neural networks are a popular machine learning tool, particularly in applications such as the prediction...
 

The Cross-Validated Adaptive Epsilon-Net Estimator (with Sandrine Dudoit and Aad W. van der Vaart), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Suppose that we observe a sample of independent and identically distributed realizations of a random...
 

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

Unified Cross-Validation Methodology For Selection Among Estimators and a General Cross-Validated Adaptive Epsilon-Net Estimator: Finite Sample Oracle Inequalities and Examples (with Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
In Part I of this article we propose a general cross-validation criterion for selecting among...
 

Asymptotically Optimal Model Selection Method with Right Censored Outcomes (with Sunduz Keles and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Over the last two decades, non-parametric and semi-parametric approaches that adapt well known techniques such...
 

Tree-based Multivariate Regression and Density Estimation with Right-Censored Data (with Annette M. Molinaro and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
We propose a unified strategy for estimator construction, selection, and performance assessment in the presence...
 

Asymptotic Optimality of Likelihood Based Cross-Validation (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Likelihood-based cross-validation is a statistical tool for selecting a density estimate based on n i.i.d....
 

Asymptotics of Cross-Validated Risk Estimation in Estimator Selection and Performance Assessment (with Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Risk estimation is an important statistical question for the purposes of selecting a good estimator...
 

Medical Specialties

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data (with Sunduz Keles, Sandrine Dudoit, and Simon E. Cawley), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
 

Microarray Data Analysis

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Multiple Testing Methods For ChIP-Chip High Density Oligonucleotide Array Data (with Sunduz Keles, Sandrine Dudoit, and Simon E. Cawley), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Cawley et al. (2004) have recently mapped the locations of binding sites for three transcription...
 

Microarrays

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

Regulatory Motif Finding by Logic Regression (with Sunduz Keles and Chris Vulpe), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Multiple transcription factors coordinately control transcriptional regulation of genes in eukaryotes. Although multiple computational methods...
 

A Statistical Method for Constructing Transcriptional Regulatory Networks Using Gene Expression and Sequence Data (with Biao Xing), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
Transcriptional regulation is one of the most important means of gene regulation. Uncovering transcriptional regulatory...
 

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

A Method to Identify Significant Clusters in Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Clustering algorithms have been widely applied to gene expression data. For both hierarchical and partitioning...
 

Comparative Genomic Hybridization Array Analysis (with Annette M. Molinaro and Dan H. Moore), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
At the present time, there is increasing evidence that cancer may be regulated by the...
 

Identification of Regulatory Elements Using A Feature Selection Method (with Sunduz Keles and Michael B. Eisen), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Many methods have been described to identify regulatory motifs in the transcription control regions of...
 

Statistical Inference for Simultaneous Clustering of Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Current methods for analysis of gene expression data are mostly based on clustering and classification...
 

Paired and Unpaired Comparisons and Clustering with Gene Expression Data (with Jennifer F. Bryan and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
We have previously described a statistical framework for using gene expression data from cDNA microarrays...
 

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Multiple Hypothesis Testing

Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise...
 

Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
The present article proposes general single-step multiple testing procedures for controlling Type I error rates...
 

Augmentation Procedures for Control of the Generalized Family-Wise Error Rate and Tail Probabilities for the Proportion of False Positives (with Sandrine Dudoit and Katherine S. Pollard), Statistical Applications in Genetics and Molecular Biology (2006)
This article shows that any single-step or stepwise multiple testing procedure (asymptotically) controlling the family-wise...
 

A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Sandrine Dudoit), Statistical Applications in Genetics and Molecular Biology (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
 

A Method to Increase the Power of Multiple Testing Procedures Through Sample Splitting (with Daniel Rubin and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Consider the standard multiple testing problem where many hypotheses are to be tested, each hypothesis...
 

Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...
 

Multiple Testing Procedures and Applications to Genomics (with Merrill D. Birkner, Katherine S. Pollard, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
This chapter proposes widely applicable resampling-based single-step and stepwise multiple testing procedures (MTP) for controlling...
 

Multiple Testing Procedures for Controlling Tail Probability Error Rates (with Sandrine Dudoit and Merrill D. Birkner), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The present article discusses and compares multiple testing procedures (MTP) for controlling Type I error...
 

Multiple Testing. Part III. Procedures for Control of the Generalized Family-Wise Error Rate and Proportion of False Positives (with Sandrine Dudoit and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The accompanying articles by Dudoit et al. (2003b) and van der Laan et al. (2003)...
 

Multiple Testing. Part II. Step-Down Procedures for Control of the Family-Wise Error Rate (with Sandrine Dudoit and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
The present article proposes two step-down multiple testing procedures for asymptotic control of the family-wise...
 

Multiple Testing. Part I. Single-Step Procedures for Control of General Type I Error Rates (with Sandrine Dudoit and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
The present article proposes general single-step multiple testing procedures for controlling Type I error rates...
 

Multivariate Analysis

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

Application of a Variable Importance Measure Method (with Merrill D. Birkner), The International Journal of Biostatistics (2006)
van der Laan (2005) proposed a targeted method used to construct variable importance measures coupled...
 

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
We propose a general and formal statistical framework for the multiple tests of associations between...
 

Data Adaptive Pathway Testing (with Merrill D. Birkner and Alan E. Hubbard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
A majority of diseases are caused by a combination of factors, for example, composite genetic...
 

Application of a Variable Importance Measure Method to HIV-1 Sequence Data (with Merrill D. Birkner), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
van der Laan (2005) proposed a method to construct variable importance measures and provided the...
 

Statistical Inference for Variable Importance, U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Many statistical problems involve the learning of an importance/effect of a variable for predicting an...
 

Test Statistics Null Distributions in Multiple Testing: Simulation Studies and Applications to Genomics (with Katherine S. Pollard, Merrill D. Birkner, and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Multiple hypothesis testing problems arise frequently in biomedical and genomic research, for instance, when identifying...
 

Colon Cancer Prognosis Prediction by Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on microarray gene...
 

Prognosis of Stage II Colon Cancer by Non-Neoplastic Mucosa Gene Expression Profiling (with Alain Barrier and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Aims. This study assessed the possibility to build a prognosis predictor, based on non-neoplastic mucosa...
 

Causal Inference in Longitudinal Studies with History-Restricted Marginal Structural Models (with Romain Neugebauer and Ira B. Tager), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
Causal Inference based on Marginal Structural Models (MSMs) is particularly attractive to subject-matter investigators because...
 

Survival Ensembles (with Torsten Hothorn, Peter Buhlmann, Sandrine Dudoit, and Annette M. Molinaro), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
We propose a unified and flexible framework for ensemble learning in the presence of censoring....
 

Cluster Analysis of Genomic Data with Applications in R (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2005)
In this paper, we provide an overview of existing partitioning and hierarchical clustering algorithms in...
 

Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
 

Loss-Based Estimation with Cross-Validation: Applications to Microarray Data Analysis and Motif Finding (with Sandrine Dudoit, Sunduz Keles, Annette M. Molinaro, Sandra E. Sinisi, and Siew Leng Teng), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
Current statistical inference problems in genomic data analysis involve parameter estimation for high-dimensional multivariate distributions,...
 

Tree-based Multivariate Regression and Density Estimation with Right-Censored Data (with Annette M. Molinaro and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2003)
We propose a unified strategy for estimator construction, selection, and performance assessment in the presence...
 

A Method to Identify Significant Clusters in Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Clustering algorithms have been widely applied to gene expression data. For both hierarchical and partitioning...
 

Comparative Genomic Hybridization Array Analysis (with Annette M. Molinaro and Dan H. Moore), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
At the present time, there is increasing evidence that cancer may be regulated by the...
 

A New Partitioning Around Medoids Algorithm (with Katherine S. Pollard and Jennifer Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2002)
Kaufman & Rousseeuw (1990) proposed a clustering algorithm Partitioning Around Medoids (PAM) which maps a...
 

Statistical Inference for Simultaneous Clustering of Gene Expression Data (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Current methods for analysis of gene expression data are mostly based on clustering and classification...
 

Paired and Unpaired Comparisons and Clustering with Gene Expression Data (with Jennifer F. Bryan and Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
We have previously described a statistical framework for using gene expression data from cDNA microarrays...
 

Hybrid Clustering of Gene Expression Data with Visualization and the Bootstrap (with Katherine S. Pollard), U.C. Berkeley Division of Biostatistics Working Paper Series (2001)
Large-scale gene expression studies are becoming increasingly common as new technologies make it possible to...
 

Gene Expression Analysis with the Parametric Bootstrap (with Jennifer F. Bryan), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
Recent developments in microarray technology make it possible to capture the gene expression profiles for...
 

Locally Efficient Estimation of a Multivariate Survival Function in Longitudinal Studies (with Alan E. Hubbard and James M. Robins), U.C. Berkeley Division of Biostatistics Working Paper Series (2000)
In this paper we develop a locally efficient one-step estimator of a multivariate survival function...
 

Statistical Computing

Multiple Testing Procedures: R multtest Package and Applications to Genomics (with Katherine S. Pollard and Sandrine Dudoit), U.C. Berkeley Division of Biostatistics Working Paper Series (2004)
The Bioconductor R package multtest implements widely applicable resampling-based single-step and stepwise multiple testing procedures...
 

Statistical Models

Multiple Testing and Data Adaptive Regression: An Application to HIV-1 Sequence Data (with Merrill D. Birkner and Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
Analysis of viral strand sequence data and viral replication capacity could potentially lead to biological...
 

Deletion/Substitution/Addition Algorithm in Learning with Applications in Genomics (with Sandra E. Sinisi), Statistical Applications in Genetics and Molecular Biology (2006)
van der Laan and Dudoit (2003) provide a road map for estimation and performance assessment...
 

Cross-Validated Bagged Prediction of Survival (with Sandra E. Sinisi and Romain Neugebauer), Statistical Applications in Genetics and Molecular Biology (2006)
In this article, we show how to apply our previously proposed Deletion/Substitution/Addition algorithm in the...
 

Super Learning: an Application to Prediction of HIV-1 Drug Susceptibility (with Sandra E. Sinisi and Maya L. Petersen), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
Many statistical methods exist that can be used to learn a predictor based on observed...
 

Causal Effect Models for Intention to Treat and Realistic Individualized Treatment Rules, U.C. Berkeley Division of Biostatistics Working Paper Series (2006)
An important class of models in causal inference are the so-called marginal structural models which...
 

Multiple Tests of Association with Biological Annotation Metadata (with Sandrine Dudoit and Sunduz Keles), U.C. Berkeley Division of Biostatistics Working Paper Series (2006)