Generation of Prediction Intervals to Assess Data Quality in the Distribute System Using Quantile Regression
2011 Joint Statistical Meetings Proceedings
Document Type
Conference Proceeding
Abstract
Distribute is a national influenza-like-illness (ILI) surveillance project that integrates data from multiple jurisdictions. Distribute works solely with summarized (aggregated) data. Timeliness of the data varies considerably between sites; for many sites, data for each encounter date arrives piecemeal, spread over several days. This spread adds noise to the data received by the Distribute system. Systematic differences in timeliness between data sources can introduce bias into the indicator of interest, the ILI ratio. Quantile regression on the observed relationship between incomplete and complete data is used to calculate prediction intervals for the complete data. Some sites have very narrow prediction intervals, indicating that the ILI ratio calculated from incomplete data approximates the complete-data ratio very accurately. Other sites show considerable asymmetry in their intervals.
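As a rough illustration of the approach the abstract describes, the sketch below fits lower and upper quantile lines to pairs of (incomplete ratio, complete ratio) and reads a prediction interval off them. It is not the authors' implementation: the function names, the 5% and 95% quantile levels, the linear form of the model, and the sample values are all assumptions made for the example. The fit uses an exhaustive search over pairs of points, which is only practical for small samples.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def fit_quantile_line(x, y, tau):
    """Fit y ~ a + b*x at quantile tau by exhaustive search.

    A two-coefficient linear quantile regression always has an optimal
    line passing through two data points, so for a small sample we can
    try every pair and keep the line with the lowest total loss.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a valid candidate
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(pinball_loss(yk - (a + b * xk), tau)
                   for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two distinct x values")
    return best[1], best[2]


def prediction_interval(x, y, x_new, lower=0.05, upper=0.95):
    """Interval for the complete ratio given an incomplete ratio x_new.

    The lower and upper levels are assumed values, not taken from the paper.
    The two lines are fitted independently, so they can cross outside the
    range of the data; no correction for that is attempted here.
    """
    a_lo, b_lo = fit_quantile_line(x, y, lower)
    a_hi, b_hi = fit_quantile_line(x, y, upper)
    return a_lo + b_lo * x_new, a_hi + b_hi * x_new


if __name__ == "__main__":
    # Synthetic values for illustration only, not Distribute data.
    incomplete = [0.010, 0.014, 0.019, 0.023, 0.028, 0.031, 0.036, 0.042]
    complete = [0.012, 0.013, 0.022, 0.024, 0.033, 0.030, 0.041, 0.047]
    print(prediction_interval(incomplete, complete, x_new=0.025))
```

A narrow interval around the fitted value would suggest that the incomplete ratio is a good proxy for the complete one at that site, while an interval that extends much further on one side would correspond to the asymmetry mentioned above.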
Citation Information
Ian Painter, Julie Eaton, Debra Revere, Bill Lober, et al. "Generation of Prediction Intervals to Assess Data Quality in the Distribute System Using Quantile Regression" 2011 Joint Statistical Meetings Proceedings (2011) pp. 5172-5179
Available at: http://works.bepress.com/julie_eaton/2/