<?xml version="1.0" encoding="utf-8" ?>
<rss version="2.0">
<channel>
<title>Associate Professor Robert Graham Clark</title>
<copyright>Copyright (c) 2012  All rights reserved.</copyright>
<link>http://works.bepress.com/rclark</link>
<description>Recent documents in Associate Professor Robert Graham Clark</description>
<language>en-us</language>
<lastBuildDate>Tue, 11 Dec 2012 21:55:27 PST</lastBuildDate>
<ttl>3600</ttl>








<item>
<title>Person-level and household-level regression estimation in household surveys</title>
<link>http://works.bepress.com/rclark/17</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/17</guid>
<pubDate>Tue, 13 Nov 2012 20:40:27 PST</pubDate>
<description>
	<![CDATA[
	<p>A common class of survey designs involves selecting all people within selected households. Generalized regression estimators can be calculated at either the person or household level. Implementing the estimator at the household level has the convenience of equal estimation weights for people within households. In this article the two approaches are compared theoretically and empirically for the case of simple random sampling of households and selection of all persons in each selected household. We find that the household-level approach is theoretically more efficient in large samples and any empirical inefficiency in small samples is limited.</p>

	]]>
</description>

<author>David G. Steel et al.</author>


</item>






<item>
<title>Oxygen exchange during the reaction of POCl3 and water</title>
<link>http://works.bepress.com/rclark/16</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/16</guid>
<pubDate>Tue, 13 Nov 2012 20:40:26 PST</pubDate>
<description>
	<![CDATA[
	<p>To investigate O exchange during the reaction of POCl3 and water, natural abundance POCl3 was reacted with water highly enriched in 18O, and the resulting H3PO4 was isolated as KH2PO4. This reaction was conducted with and without tetrahydrofuran (THF) as a solvent, and was controlled in THF and violent in its absence. Approximately 5 x 10<sup>-4</sup> M aqueous solutions of the KH2PO4 were analyzed using electrospray ionization mass spectrometry, to estimate the proportions of the mass-clumped 16,17,18O isotope analogues of [H2PO4]-. During analysis, ~29% of [H2PO4]- dehydrated to [PO3]-, for which the proportions of the O isotope analogues were also measured. These proportions were compared with those predicted for O exchange at either four or three positions on the P atom of POCl3. The data strongly support O exchange at all four positions, whether or not THF was used to moderate conditions during the reaction. This result clears the way for safe, predictable synthesis of heavy-O labelled orthophosphate from POCl3 and 18O enriched water for evaluation as an environmental and biochemical tracer. Copyright CSIRO 2011.</p>

	]]>
</description>

<author>Robert G. Clark et al.</author>


</item>






<item>
<title>Robust Resampling Confidence Intervals for Empirical Variograms</title>
<link>http://works.bepress.com/rclark/15</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/15</guid>
<pubDate>Tue, 13 Nov 2012 20:40:24 PST</pubDate>
<description>
	<![CDATA[
	<p>The variogram function is an important measure of the spatial dependencies of a geostatistical or other spatial dataset. It plays a central role in kriging, designing spatial studies, and in understanding the spatial properties of geological and environmental phenomena. It is therefore important to understand the variability attached to estimates of the variogram. Existing methods for constructing confidence intervals around the empirical variogram either rely on strong assumptions, such as normality or known variogram function, or are based on resampling blocks and subject to edge effect biases. This paper proposes two new procedures for addressing these concerns: a quasi-block-bootstrap and a quasi-block-jackknife. The new methods are based on transforming the data to decorrelate it based on a fitted variogram model, resampling blocks from the decorrelated data, and then recorrelating. The coverage properties of the new confidence intervals are compared by simulation to a number of existing resampling-based intervals. The proposed quasi-block-jackknife confidence interval is found to have the best properties of all of the methods considered across a range of scenarios, including normally and lognormally distributed data and misspecification of the variogram function used to decorrelate the data.</p>

	]]>
</description>

<author>Robert G. Clark et al.</author>


</item>






<item>
<title>Design and Analysis of Clustered, Unmatched Resource Selection Studies</title>
<link>http://works.bepress.com/rclark/14</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/14</guid>
<pubDate>Wed, 03 Aug 2011 21:52:43 PDT</pubDate>
<description>
	<![CDATA[
	<p>Studies which measure animals' positions over time are a vital tool in understanding the process of resource selection by animals. By comparing a sample of locations used by animals with a sample of available points, the types of locations preferred by animals can be analysed using logistic regression. Random effects logistic regression has been proposed to deal with the repeated measurements observed for each animal, but we find that this is not feasible in studies where the sample of available points cannot readily be matched to specific animals. Instead, this paper investigates the use of marginal logistic models with robust variance estimators, using a study of Australian bush rats as a case study. Simulation is used to check the properties of the approach and to explore alternative designs.</p>

	]]>
</description>

<author>Robert Graham Clark et al.</author>


</item>






<item>
<title>Comments on Sample Design for Proposed Australian Asthma Survey</title>
<link>http://works.bepress.com/rclark/13</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/13</guid>
<pubDate>Wed, 03 Aug 2011 21:52:41 PDT</pubDate>
<description>
	<![CDATA[
	<p>The proposed design for the Australian Asthma Survey involves a phone or face-to-face screening interview with approximately 20,000 responding adults, followed by an in-depth interview and objective testing of all asthmatics and 1/10th of non-asthmatics in the screen. This report elaborates on sample design options based on the aims and approaches in the Australian Asthma Survey Proposal. The main requirement affecting the sample design is the need for a relatively small number of objective testing centres to be able to service the whole sample. This report considered a number of options where the sample was clustered in only 25 Statistical Local Areas (SLAs), as well as other options. This will facilitate the objective testing with the penalty of a significant loss of precision for other statistics. The priority and operational process for objective testing should therefore be carefully thought through. In particular, the cost of objective testing should be estimated under different scenarios including selecting 50 SLAs rather than 25. There are a number of options for conducting the screening sample and subselection for the in-depth survey. These include screening all adults or just one adult in each household. One attractive option is to select one adult from each household for the screening interview, then to conduct the selection for the in-depth survey and if possible the in-depth survey interview as part of the same call. Other options exist which are more complex but which give good results with a smaller screening sample. The survey will also include children (at most one child per household) but this report focuses on the adult sample.
The major issues to be considered in finalising the sample design are: • clustering of the sample to accommodate the objective testing process (in particular the number of SLAs); • method of conducting the screening sample (one adult or all adults per household); • method of selection and conduct of the in-depth survey, in particular whether this can be done by interviewers in one visit or phone call; • whether the screen and in-depth survey will be conducted by face-to-face or telephone interviewing. Other detailed issues will include: the stratification and selection method for SLAs; the number of CDs to select if face-to-face interviewing is used; the method of selecting CDs; the method of selecting households within CDs; weighting and variance estimation. Section 2 of this report summarises some relevant terms and concepts in comparing sample designs. Section 3 describes three alternative approaches for conducting the screen and in-depth surveys. Section 4 compares six alternative sample designs including alternative approaches to clustering the sample. Appendix 1 contains some detailed comments on the Proposal, Appendix 2 elaborates on one of the sample design options in more detail and Appendix 3 contains some comments on weighting.</p>

	]]>
</description>

<author>Robert Graham Clark</author>


</item>






<item>
<title>The Effect of using Household as a Sampling Unit</title>
<link>http://works.bepress.com/rclark/11</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/11</guid>
<pubDate>Wed, 03 Aug 2011 21:52:35 PDT</pubDate>
<description>
	<![CDATA[
	<p>The effect of sampling people through households is considered. Results on design effects for two-stage surveys are reviewed and applied to give design effects of household samples. The main factors that determine the design effect are identified for the designs in which one person, or all people, are selected from each selected household. Within-household correlation is one factor. We show that the relationships between household size and the mean and variance within households are also important factors. Census and survey data are used to empirically compare the design effects for a range of estimators, variables and designs.</p>

	]]>
</description>

<author>Robert Graham Clark et al.</author>


</item>






<item>
<title>Adaptive Calibration for Prediction of Finite Population Totals</title>
<link>http://works.bepress.com/rclark/10</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/10</guid>
<pubDate>Wed, 03 Aug 2011 21:52:33 PDT</pubDate>
<description>
	<![CDATA[
	<p>Sample weights can be calibrated to reflect the known population totals of a set of auxiliary variables. Predictors of finite population totals calculated using these weights have low bias if these variables are related to the variable of interest, but can have high variance if too many auxiliary variables are used. This article develops an "adaptive calibration" approach, where the auxiliary variables to be used in weighting are selected using sample data. Adaptively calibrated estimators are shown to have lower mean squared error and better coverage properties than non-adaptive estimators in many cases.</p>

	]]>
</description>

<author>Robert Graham Clark et al.</author>


</item>






<item>
<title>Sample design and estimation for household surveys</title>
<link>http://works.bepress.com/rclark/9</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/9</guid>
<pubDate>Wed, 03 Aug 2011 21:52:31 PDT</pubDate>
<description>
	<![CDATA[
	<p>Household surveys are a widely-used tool for obtaining information about a population of people.  A sample of households is selected followed by a sample of people within selected households.  Households exhibit structure with variables measured on people in the same household often being dependent.  Household sizes vary significantly and the strength of dependencies within a household may depend on its size.  Traditional sample design and estimation methods ignore these dependencies.  Methodologies which explicitly allow for the dependencies which may arise within households are developed in four areas: •	Estimating the design effects of standard estimators, for survey design; •	Constructing new estimators to exploit dependencies within households; •	Selecting a set of auxiliary variables to use in regression estimation; •	Allocating the sample sizes of households and of people within households.  In each case, new methods will be developed which allow for the population structure of people within households.  The new methods will be compared theoretically and numerically to existing methods to show whether, and under what conditions, it is worthwhile explicitly allowing for this population structure.  This thesis is concerned with the sampling error of estimators of population totals; that is, the error due to selecting only a sample and not the whole population of people.  The model-assisted framework will be used.  The thesis finds that the population structure of people within households can be exploited to give several useful innovations.  It is shown that, in estimating the variance at the sample design stage, the variation of household size must be considered.  This variation is ignored in existing methods for estimating the design effect, and a more accurate method is developed.  It is found that minor improvements can be made to standard estimators of total by considering within-household dependencies.  
An “integrated weighting” method, based on a linear contextual model, which has important practical advantages is found to often have slightly lower variance than non-integrated methods, contrary to common belief.  Existing criteria for selecting which auxiliary variables to use in regression estimation are extended to the case of two-stage sampling, and applied to household surveys.  In most household surveys, either one person or all people are selected from each selected household.  More general designs, in which the number of people selected is a function of the number of people in the household, are developed.  The fact that the number of people in a household is small leads to some novel and efficient sample designs and estimators.</p>

	]]>
</description>

<author>Robert Graham Clark</author>


</item>






<item>
<title>Adaptive Inference for Multi-Stage Survey Data</title>
<link>http://works.bepress.com/rclark/8</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/8</guid>
<pubDate>Wed, 03 Aug 2011 21:52:29 PDT</pubDate>
<description>
	<![CDATA[
	<p>Two-stage sampling usually leads to higher variances for estimators of means and regression coefficients, because of intra-cluster homogeneity. One way of allowing for clustering in fitting a linear regression model is to use a linear mixed model with two levels. If the estimated intra-cluster correlation is close to zero, it may be acceptable to ignore clustering and use a single-level model. In this paper an adaptive strategy is evaluated for estimating the variances of estimated regression coefficients. The strategy is based on testing the null hypothesis that the random effect variance component is zero. If this hypothesis is accepted, the estimated variances of estimated regression coefficients are extracted from the one-level linear model. Otherwise, the estimated variance is based on the linear mixed model or, alternatively, the Huber-White robust variance estimator is used. A simulation study is used to show that the adaptive approach provides reasonably correct inference in a simple case.</p>

	]]>
</description>

<author>L. M. Al-Zou&apos;bi et al.</author>


</item>






<item>
<title>Sampling within households in household surveys</title>
<link>http://works.bepress.com/rclark/6</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/6</guid>
<pubDate>Wed, 03 Aug 2011 21:52:24 PDT</pubDate>
<description>
	<![CDATA[
	<p>The number of people to select within selected households has significant consequences for the conduct and output of household surveys. The operational and data quality implications of this choice are carefully considered in many surveys, but the impact on statistical efficiency is not well understood. The usual approach is to select all people in each selected household, where operational and data quality concerns make this feasible. If not, one person is usually selected from each selected household. We find that this strategy is not always justified, and develop intermediate designs between these two extremes. Current practices were developed when household survey field procedures needed to be simple and robust; however, more complex designs are now feasible due to the increasing use of computer-assisted interviewing. We develop more flexible designs by optimising survey cost, based on a simple cost model, subject to a required variance for an estimator of population total. The innovation lies in the fact that household sample sizes are small integers, which creates challenges in both design and estimation. The new methods are evaluated empirically using census and health survey data, showing considerable improvement over existing methods in some cases.</p>

	]]>
</description>

<author>Robert Graham Clark et al.</author>


</item>






<item>
<title>Preliminary sample design for the New Zealand Health Survey 2010</title>
<link>http://works.bepress.com/rclark/5</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/5</guid>
<pubDate>Wed, 03 Aug 2011 21:52:22 PDT</pubDate>
<description>
	<![CDATA[
	<p>This report describes the choice of the preliminary design for the New Zealand Health Survey, to be implemented from 2011. The survey will use computer-assisted personal interviewing. The sample will be selected using a multi-stage area design. The selected sample size will be around 12,000 people per year. This is envisaged as sufficient to provide adequate precision for estimates of key prevalences for adults and children. The main objectives of the sample design are: • The design should support analysis of the survey by multiple users, which implies avoiding great variation in estimation weights. • Estimates for children and adults are required. • A range of prevalences are to be estimated. These include health behaviours and health conditions. • Estimates by ethnic group are required. Māori estimates are the most important of these and Pacific and Asian estimates are also required. Estimates by ethnic group (Māori, Pacific and Asian) are a particular priority. A typical multi-stage area-based design would not give adequate sample in these groups. Ensuring adequate estimates from these subpopulations, while preserving precision at the national level, was the main focus of this sample design. Two main strategies will be used to increase the effective sample sizes for these populations: • A dual frame approach will be used. An area-based sample from NZ as a whole will be combined with a list-based sample of addresses on the Electoral Roll, to boost Māori sample size, subject to successful testing of this approach. • The area-based sample will be targeted towards the subpopulations of interest, by assigning higher probabilities of selection to meshblocks with higher concentrations of these groups. Sections 2 and 3 of this report describe the main elements of the area-based sample design and the list-based Electoral Roll sample design, respectively. Section 4 summarises sample sizes for the preliminary design.
Section 5 outlines other issues which tenderers may need to consider. Appendix 1 details how the design settings in Sections 2 and 3 were derived, and Appendix 2 has more detailed tables on DHB sample sizes and standard errors for three of the design options that were considered.</p>

	]]>
</description>

<author>Robert Graham Clark</author>


</item>






<item>
<title>Sampling for Subpopulations in Two-Stage Surveys</title>
<link>http://works.bepress.com/rclark/4</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/4</guid>
<pubDate>Wed, 03 Aug 2011 21:52:20 PDT</pubDate>
<description>
	<![CDATA[
	<p>Many national household interview surveys aim to produce statistics on small subpopulations, such as specific ethnic groups or the indigenous population of a country. In most countries, there is no reliable frame of the subpopulations of interest, so it is necessary to sample from the general population, which can be very expensive. The most common strategies used in practice for sampling rare subpopulations are the use of a large screening sample, and disproportionate sampling by strata. Optimal sample designs have been derived for the case of one-stage sampling, but most household surveys use two or more stages of selection. This paper develops optimal designs for two-stage sampling, where there is auxiliary information on subpopulation membership for each primary sampling unit. Various alternative designs are evaluated using a simulated population derived from the New Zealand Census.</p>

	]]>
</description>

<author>Robert Graham Clark</author>


</item>






<item>
<title>Accounting for the uncertainty of information on clustering in the design of a clustered sample</title>
<link>http://works.bepress.com/rclark/2</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/2</guid>
<pubDate>Wed, 03 Aug 2011 21:52:16 PDT</pubDate>
<description>
	<![CDATA[
	<p>An important decision that has to be made in developing the design of a cluster or multi-stage sampling scheme is the number of units to select at each stage of selection. For a two-stage design we need to decide the number of units to select from each Primary Sampling Unit (PSU) in the sample. A common approach is to estimate the costs and the variance components associated with each stage of selection and determine an optimal design. This is usually done for estimates of the means or totals of one or a small number of variables. In practice the measure of intra-cluster homogeneity, which is the ratio of the variance components, needs to be estimated from a pilot study or historical data. There may be considerable uncertainty about the intra-cluster correlation. The parameter can be close to zero and the estimate may not even differ significantly from zero; however, a design based on zero intra-cluster correlation would be highly clustered and sensitive to any failure of this assumption. This paper considers the effect of uncertainty about the intra-cluster correlation and other relevant population parameters on sample design. We develop an approach to assess this uncertainty using a Bayesian bootstrap method.</p>

	]]>
</description>

<author>David Steel et al.</author>


</item>






<item>
<title>Further Simulation Results on Resampling Confidence Intervals for Empirical Variograms</title>
<link>http://works.bepress.com/rclark/1</link>
<guid isPermaLink="true">http://works.bepress.com/rclark/1</guid>
<pubDate>Wed, 03 Aug 2011 21:52:14 PDT</pubDate>
<description>
	<![CDATA[
	<p>Clark and Allingham (2010) described and evaluated a number of replication-based confidence intervals for the binned empirical variogram. All were based on fitting an exponential variogram model to two-dimensional spatial data. This article will hereafter be referred to as CA10. CA10 evaluated the coverage of the various confidence intervals by simulating spatial data. Datasets were simulated using multivariate normal or lognormal distributions, with the exponential or Gaussian variogram, with differing effective ranges. The Gaussian variogram was included to assess the robustness of the intervals to the assumption that spatial correlations followed an exponential variogram model. This note further explores the robustness of the confidence intervals to misspecification of the variogram, by extending the simulations of CA10 to include cubic and spherical variogram models. Confidence intervals were calculated based on an assumption of an exponential variogram model, as in CA10.</p>

	]]>
</description>

<author>Robert Graham Clark et al.</author>


</item>





</channel>
</rss>
