<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>Sherri Rose</title>
<copyright>Copyright (c) 2012  All rights reserved.</copyright>
<link>http://works.bepress.com/sherri_rose</link>
<description>Recent documents in Sherri Rose</description>
<language>en-us</language>
<lastBuildDate>Fri, 13 Jan 2012 07:59:17 PST</lastBuildDate>
<ttl>3600</ttl>








<item>
<title>Targeted Learning: Causal Inference for Observational and Experimental Data</title>
<link>http://works.bepress.com/sherri_rose/18</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/18</guid>
<pubDate>Fri, 08 Jul 2011 18:49:26 PDT</pubDate>
<description>
	<![CDATA[
	<p>The statistics profession is at a unique point in history. The need for valid statistical tools is greater than ever; data sets are massive, often measuring hundreds of thousands of measurements for a single subject. The field is ready to move toward clear objective benchmarks under which tools can be evaluated. Targeted learning allows (1) the full generalization and utilization of cross-validation as an estimator selection tool so that the subjective choices made by humans are now made by the machine, and (2) targeting the fitting of the probability distribution of the data toward the target parameter representing the scientific question of interest.</p>
<p>This book is aimed at both statisticians and applied researchers interested in causal inference and general effect estimation for observational and experimental data. Part I is an accessible introduction to super learning and the targeted maximum likelihood estimator, including related concepts necessary to understand and apply these methods. Parts II-IX handle complex data structures and topics applied researchers will immediately recognize from their own research, including time-to-event outcomes, direct and indirect effects, positivity violations, case-control studies, censored data, longitudinal data including time-dependent confounding, and genomic studies.</p>
<p>See targetedlearningbook.com.</p>

	]]>
</description>

<author>Mark J. van der Laan et al.</author>


<category>Causal Inference</category>

</item>






<item>
<title>Targeted Methods for Finding Quantitative Trait Loci</title>
<link>http://works.bepress.com/sherri_rose/17</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/17</guid>
<pubDate>Fri, 08 Jul 2011 18:42:50 PDT</pubDate>
<description>
	<![CDATA[
	<p>Conventional genetic mapping methods typically assume parametric models with Gaussian errors, and obtain parameter estimates through maximum likelihood estimation. We propose a general semiparametric model to map quantitative trait loci (QTL) in experimental crosses. In contrast with widely-used interval mapping (IM) derived methods, our model requires fewer assumptions and also accommodates various machine learning algorithms.  Estimation using both targeted maximum likelihood and collaborative targeted maximum likelihood methods is compared to a composite interval mapping (CIM) approach.  We demonstrate with simulations and real data analyses that, on average, our semiparametric targeted learning approach produces less biased QTL effect estimates than those from parametric models.</p>

	]]>
</description>

<author>Hui Wang et al.</author>


<category>Biology</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Rose et al. Respond to “G-Computation and Standardization in Epidemiology”</title>
<link>http://works.bepress.com/sherri_rose/16</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/16</guid>
<pubDate>Sat, 19 Mar 2011 02:51:50 PDT</pubDate>
<description>
	<![CDATA[
	<p>We thank Vansteelandt and Keiding (1) for their commentary on our article (2), in which we implemented G-computation, a maximum likelihood-based substitution estimator of the G-formula. The goals of that article included 1) translating G-computation into the applied epidemiology literature by using a point-treatment example and marginal parameter, 2) drawing connections between traditional regression and G-computation, 3) demonstrating G-computation in a simple simulated data set, and 4) briefly presenting related topics, such as super learning (3, 4). Their commentary provides valuable background on G-computation that was outside the scope of our article. Standardization was addressed, albeit briefly, in our article, and we disagree that our chosen presentation of G-computation was divorced from the literature. We respond to their remaining commentary via a road map for effect estimation (4), which can be a useful component of epidemiologic analysis and can guide investigators to address issues raised by Vansteelandt and Keiding (1).</p>

	]]>
</description>

<author>Sherri Rose et al.</author>


<category>Causal Inference</category>

<category>Epidemiology</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Implementation of G-Computation on a Simulated Data Set: Demonstration of a Causal Inference Technique</title>
<link>http://works.bepress.com/sherri_rose/15</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/15</guid>
<pubDate>Sat, 19 Mar 2011 02:47:28 PDT</pubDate>
<description>
	<![CDATA[
	<p>The growing body of work in the epidemiology literature focused on G-computation includes theoretical explanations of the method but very few simulations or examples of application. The small number of G-computation analyses in the epidemiology literature relative to other causal inference approaches may be partially due to a lack of didactic explanations of the method targeted toward an epidemiology audience. The authors provide a step-by-step demonstration of G-computation that is intended to familiarize the reader with this procedure. The authors simulate a data set and then demonstrate both G-computation and traditional regression to draw connections and illustrate contrasts between their implementation and interpretation relative to the truth of the simulation protocol. A marginal structural model is used for effect estimation in the G-computation example. The authors conclude by answering a series of questions to emphasize the key characteristics of causal inference techniques and the G-computation procedure in particular.</p>

	]]>
</description>

<author>Jonathan M. Snowden et al.</author>


<category>Causal Inference</category>

<category>Epidemiology</category>

</item>






<item>
<title>A Targeted Maximum Likelihood Estimator for Two-Stage Designs</title>
<link>http://works.bepress.com/sherri_rose/14</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/14</guid>
<pubDate>Fri, 11 Mar 2011 13:20:46 PST</pubDate>
<description>
	<![CDATA[
	<p>We consider two-stage sampling designs, including so-called nested case control studies, where one takes a random sample from a target population and completes measurements on each subject in the first stage. The second stage involves drawing a subsample from the original sample, collecting additional data on the subsample. This data structure can be viewed as a missing data structure on the full-data structure collected in the second-stage of the study.  Methods for analyzing two-stage designs include parametric maximum likelihood estimation and estimating equation methodology.  We propose an inverse probability of censoring weighted targeted maximum likelihood estimator (IPCW-TMLE) in two-stage sampling designs and present simulation studies featuring this estimator.</p>

	]]>
</description>

<author>Sherri Rose et al.</author>


<category>Causal Inference</category>

<category>Clinical Trials</category>

<category>Epidemiology</category>

<category>Statistical Models</category>

<category>Statistical Theory and Methods</category>

</item>






<item>
<title>Profiling Cys34 Adducts of Human Serum Albumin by Fixed-Step Selected Reaction Monitoring</title>
<link>http://works.bepress.com/sherri_rose/13</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/13</guid>
<pubDate>Fri, 14 Jan 2011 00:30:46 PST</pubDate>
<description>
	<![CDATA[
	<p>A method is described for profiling putative adducts (or other unknown covalent modifications) at the Cys34 locus of human serum albumin (HSA), which represents the preferred reaction site for small electrophilic species in human serum. By comparing profiles of putative HSA-Cys34 adducts across populations of interest it is theoretically possible to explore environmental causes of degenerative diseases and cancer caused by both exogenous and endogenous chemicals. We report a novel application of selected-reaction-monitoring (SRM) mass spectrometry, termed fixed-step SRM (FS-SRM), that allows detection of essentially all HSA-Cys34 modifications over a specified range of mass increases (added masses). After tryptic digestion, HSA-Cys34 adducts are contained in the third largest peptide (T3), which contains 21 amino acids and has an average mass of 2433.87 Da. The FS-SRM method does not require that exact masses of T3 adducts be known in advance but rather uses a theoretical list of T3-adduct m/z values separated by a fixed increment of 1.5. In terms of added masses, each triply-charged parent ion represents a bin of ±2.3 Da between 9.1 Da and 351.1 Da. Synthetic T3 adducts were used to optimize FS-SRM and to establish screening rules based upon selected b- and y-series fragment ions. An isotopically labeled T3 adduct is added to protein digests to facilitate quantification of putative adducts. We used FS-SRM to generate putative adduct profiles from 6 archived specimens of HSA that had been pooled by gender, race, and smoking status. An average of 66 putative adduct hits (out of a possible 77) were detected in these samples. Putative adducts covered a wide range of concentrations, were most abundant in the mass range below 100 Da, and were more abundant in smokers than in nonsmokers. With minor modifications, the FS-SRM methodology can be applied to other nucleophilic sites and proteins.</p>

	]]>
</description>

<author>He Li et al.</author>


<category>Biology</category>

</item>






<item>
<title>Finding quantitative trait loci genes with collaborative targeted maximum likelihood learning</title>
<link>http://works.bepress.com/sherri_rose/12</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/12</guid>
<pubDate>Fri, 19 Nov 2010 16:11:00 PST</pubDate>
<description>
	<![CDATA[
	<p>Quantitative trait loci mapping is focused on identifying the positions and effect of genes underlying an observed trait. We present a collaborative targeted maximum likelihood estimator in a semi-parametric model using a newly proposed 2-part super learning algorithm to find quantitative trait loci genes in listeria data. Results are compared to the parametric composite interval mapping approach.</p>

	]]>
</description>

<author>Hui Wang et al.</author>


<category>Biology</category>

</item>






<item>
<title>Statistics Ready for a Revolution: Next Generation of Statisticians Must Build Tools for Massive Data Sets</title>
<link>http://works.bepress.com/sherri_rose/11</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/11</guid>
<pubDate>Tue, 07 Sep 2010 14:21:41 PDT</pubDate>
<description>
	<![CDATA[
	<p>The statistics profession has reached a tipping point. The need for valid statistical tools is greater than ever; data sets are massive, often measuring hundreds of thousands of measurements for a single subject. The field is ready for a revolution, one driven by clear, objective benchmarks by which tools can be evaluated.</p>

	]]>
</description>

<author>Mark J. van der Laan et al.</author>


<category>Media Publications</category>

</item>






<item>
<title>Effects of PON polymorphisms and haplotypes on molecular phenotype in Mexican-American mothers and children</title>
<link>http://works.bepress.com/sherri_rose/10</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/10</guid>
<pubDate>Fri, 26 Mar 2010 14:22:35 PDT</pubDate>
<description>
	<![CDATA[
	<p>Paraoxonase 1 (PON1) prevents oxidation of low-density lipoproteins and inactivates toxic oxon derivatives of organophosphate pesticides (OPs). More than 250 SNPs have been previously identified in the PON1 gene, yet studies of PON1 genetic variation focus primarily on a few promoter SNPs (-108, -162) and coding SNPs (192, 55). We sequenced the PON1 gene in 30 subjects from a Mexican-American birth cohort and identified 94 polymorphisms with minor allele frequencies >5%, including several novel variants (six SNPs, one insertion, and two deletions). Variants of the PON1 gene and three SNPs from PON2 and PON3 were genotyped in 700 children and mothers from the same cohort. PON1 phenotype was established using two substrate-specific assays: arylesterase (AREase) and paraoxonase (POase). Twelve PON1 and two PON2 polymorphisms were significantly associated with AREase activity, and 37 polymorphisms with POase activity; however, only nine were not in strong linkage disequilibrium (LD) with either PON1<sub>-108</sub> or PON1<sub>192</sub> (r<sup>2</sup> > 0.20), SNPs with known effects on PON1 quantity and substrate-specific activity. Single tagSNPs PON1<sub>55</sub> and PON1<sub>192</sub> accounted for similar ranges of AREase variation compared to haplotypes comprised of multiple SNPs within their haplotype blocks. However, PON1<sub>55</sub> explained 11-16% of POase activity, while six SNPs in the same haplotype block explained threefold more variance (36-56%). Although LD structure in the PON cluster seems similar between Mexicans and Caucasians, allele frequencies for many polymorphisms differed strikingly. Functional effects of PON genetic variation related to susceptibility to OPs and oxidative stress also differed by age and should be considered in protecting vulnerable subpopulations.</p>

	]]>
</description>

<author>Karen Huen et al.</author>


<category>Biology</category>

</item>






<item>
<title>Modelling the network of cell cycle transcription factors in the yeast Saccharomyces cerevisiae</title>
<link>http://works.bepress.com/sherri_rose/8</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/8</guid>
<pubDate>Thu, 08 Oct 2009 16:53:40 PDT</pubDate>
<description>
	<![CDATA[
	<p>Reverse-engineering regulatory networks is one of the central challenges for computational biology. Many techniques have been developed to accomplish this by utilizing transcription factor binding data in conjunction with expression data. Of these approaches, several have focused on the reconstruction of the cell cycle regulatory network of Saccharomyces cerevisiae. The emphasis of these studies has been to model the relationships between transcription factors and their target genes. In contrast, here we focus on reverse-engineering the network of relationships among transcription factors that regulate the cell cycle in S. cerevisiae.</p>

	]]>
</description>

<author>Shawn Cokus et al.</author>


<category>Biology</category>

</item>






<item>
<title>Ensuring the comparability of comparison groups: is randomization enough?</title>
<link>http://works.bepress.com/sherri_rose/7</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/7</guid>
<pubDate>Thu, 08 Oct 2009 16:47:13 PDT</pubDate>
<description>
	<![CDATA[
	<p>It is widely believed that baseline imbalances in randomized trials must necessarily be random. In fact, there is a type of selection bias that can cause substantial, systematic and reproducible baseline imbalances of prognostic covariates even in properly randomized trials. It is possible, given complete data, to quantify both the susceptibility of a given trial to this type of selection bias and the extent to which selection bias appears to have caused either observable or unobservable baseline imbalances. Yet, in articles reporting on randomized trials, it is uncommon to find either these assessments or the information that would enable a reader to conduct them. Nevertheless, there have been a few published reports that contain descriptions of either this type of selection bias or indicators that it may have occurred.  Objective: To document that the same type of selection bias has been described in numerous randomized trials and therefore that it represents a problem deserving of greater attention. Study selection: Computerized searches were not useful in locating trials with one or more elements that contribute to or are indicative of selection bias in randomized trials. We limit our treatment to trials that were previously questioned for susceptibility to selection bias or for large baseline imbalances. Results: We found 14 randomized trials that appear to be suspicious for selection bias. This may represent only the tip of the iceberg, because the status of other trials is inconclusive. Conclusions: Authors of clinical trial reports should be required to disclose sufficient details to allow for an assessment of both allocation concealment and selection bias. The extent to which a randomized study was susceptible to selection bias should be considered in determining the relative contribution it makes to any subsequent meta-analysis, policy or decision.</p>

	]]>
</description>

<author>Vance Berger et al.</author>


<category>Clinical Trials</category>

</item>






<item>
<title>Readings in Targeted Maximum Likelihood Estimation</title>
<link>http://works.bepress.com/sherri_rose/6</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/6</guid>
<pubDate>Thu, 08 Oct 2009 16:37:11 PDT</pubDate>
<description>
	<![CDATA[
	<p>This is a compilation of current and past work on targeted maximum likelihood estimation.  It features the original targeted maximum likelihood learning paper as well as chapters on super (machine) learning using cross validation, randomized controlled trials, realistic individualized treatment rules in observational studies, biomarker discovery, case-control studies, and time-to-event outcomes with censored data, among others.  We hope this collection is helpful to the interested reader and stimulates additional research in this important area.</p>

	]]>
</description>

<author>Mark J. van der Laan et al.</author>


<category>Causal Inference</category>

</item>






<item>
<title>Simple Optimal Weighting of Cases and Controls in Case-Control Studies</title>
<link>http://works.bepress.com/sherri_rose/5</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/5</guid>
<pubDate>Thu, 08 Oct 2009 16:37:11 PDT</pubDate>
<description>
	<![CDATA[
	<p>Researchers of uncommon diseases are often interested in assessing potential risk factors. Given the low incidence of disease, these studies are frequently case-control in design. Such a design allows a sufficient number of cases to be obtained without extensive sampling and can increase efficiency; however, these case-control samples are then biased since the proportion of cases in the sample is not the same as in the population of interest. Methods for analyzing case-control studies have focused on utilizing logistic regression models that provide conditional and not causal estimates of the odds ratio. This article will demonstrate the use of the prevalence probability and case-control weighted targeted maximum likelihood estimation (MLE), as described by van der Laan (2008), in order to obtain causal estimates of the parameters of interest (risk difference, relative risk, and odds ratio). It is meant to be used as a guide for researchers, with step-by-step directions to implement this methodology. We will also present simulation studies that show the improved efficiency of the case-control weighted targeted MLE compared to other techniques.</p>

	]]>
</description>

<author>Sherri Rose et al.</author>


<category>Causal Inference</category>

</item>






<item>
<title>Why Match? Investigating Matched Case-Control Study Designs with Causal Effect Estimation</title>
<link>http://works.bepress.com/sherri_rose/3</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/3</guid>
<pubDate>Thu, 08 Oct 2009 16:37:10 PDT</pubDate>
<description>
	<![CDATA[
	<p>Matched case-control study designs are commonly implemented in the field of public health. While matching is intended to eliminate confounding, the main potential benefit of matching in case-control studies is a gain in efficiency. Methods for analyzing matched case-control studies have focused on utilizing conditional logistic regression models that provide conditional and not causal estimates of the odds ratio. This article investigates the use of case-control weighted targeted maximum likelihood estimation to obtain marginal causal effects in matched case-control study designs.  We compare the use of case-control weighted targeted maximum likelihood estimation in matched and unmatched designs in an effort to explore which design yields the most information about the marginal causal effect.  The procedures require knowledge of certain prevalence probabilities and were previously described by van der Laan (2008).  In many practical situations where a causal effect is the parameter of interest, researchers may be better served using an unmatched design.</p>

	]]>
</description>

<author>Sherri Rose et al.</author>


<category>Causal Inference</category>

</item>






<item>
<title>A Note on Risk Prediction for Case-Control Studies</title>
<link>http://works.bepress.com/sherri_rose/2</link>
<guid isPermaLink="true">http://works.bepress.com/sherri_rose/2</guid>
<pubDate>Thu, 08 Oct 2009 16:37:10 PDT</pubDate>
<description>
	<![CDATA[
	<p>We introduce a new method for prediction in case-control study designs, which is a simple extension of the work by van der Laan (2008). Case-control samples are biased since the proportion of cases in the sample is not the same as in the population of interest. The case-control weighting for prediction proposed in this paper relies on knowledge of the true incidence probability P(Y=1) to eliminate the bias of the sampling design. In many practical settings, case-control weighting will outperform an existing method for prediction, intercept adjustment.</p>

	]]>
</description>

<author>Sherri Rose et al.</author>


<category>Prediction</category>

</item>





</channel>
</rss>

