Generation of Prediction Intervals to Assess Data Quality in the Distribute System Using Quantile Regression

Publication Date

12-1-2011

Document Type

Conference Proceeding

Abstract

Distribute is a national influenza-like-illness (ILI) surveillance project that integrates data from multiple jurisdictions. Distribute works solely with summarized (aggregated) data. Timeliness of the data varies considerably between sites; for many sites, data for each encounter date arrive piecemeal, spread over several days. This spread adds noise to the data received by the Distribute system. Systematic differences in timeliness between data sources can introduce bias into the indicator of interest, the ILI ratio. Quantile regression on the observed relationship between incomplete and complete data is used to calculate prediction intervals for the complete data. Some sites have very narrow prediction intervals, indicating that the ILI ratio calculated from incomplete data approximates the complete-data ratio very accurately. Other sites show considerable asymmetry in their prediction intervals.
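The abstract does not give the model form, the quantile levels, or the software used, so the following is only a minimal sketch of the general idea: fit a lower and an upper quantile line to historical pairs of (incomplete ILI ratio, complete ILI ratio) and read a prediction interval for the complete ratio off those lines. The function names, the single-predictor linear model, and the 5%/95% levels are all assumptions made for illustration, not details from the paper.

```python
from itertools import combinations


def pinball_loss(residual, tau):
    """Check (pinball) loss of one residual at quantile level tau."""
    return residual * tau if residual >= 0 else residual * (tau - 1)


def fit_quantile_line(x, y, tau):
    """Fit y ~ a + b*x at quantile level tau by exhaustive search.

    For a two-parameter linear quantile regression, some optimal line
    passes through at least two data points, so trying every pair of
    points with distinct x values is exact. The cost is O(n^3), which
    is acceptable only for small illustrative data sets.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical pair does not define a line y = a + b*x
        b = (y[j] - y[i]) / (x[j] - x[i])
        a = y[i] - b * x[i]
        loss = sum(pinball_loss(yk - (a + b * xk), tau) for xk, yk in zip(x, y))
        if best is None or loss < best[0]:
            best = (loss, a, b)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


def prediction_interval(x, y, x_new, lower=0.05, upper=0.95):
    """Interval for the complete ratio given a new incomplete ratio x_new.

    x: historical ILI ratios computed from incomplete data (assumed input)
    y: the corresponding ratios once the data were complete (assumed input)
    The default quantile levels are illustrative assumptions.
    """
    a_lo, b_lo = fit_quantile_line(x, y, lower)
    a_hi, b_hi = fit_quantile_line(x, y, upper)
    return a_lo + b_lo * x_new, a_hi + b_hi * x_new
```

Under this sketch, a site whose incomplete ratios track the complete ratios closely yields lower and upper lines that nearly coincide, which corresponds to a narrow interval. Because the two lines are fitted separately, the interval need not be symmetric about the incomplete ratio. A production analysis would more likely use an established quantile regression routine than this brute-force search.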

Publication Title

2011 Joint Statistical Meetings Proceedings

First Page

5172

Last Page

5179

