Adapting to Source Properties in Processing Data Integration Queries
Departmental Papers (CIS)
  • Zachary G Ives, University of Pennsylvania
  • Alon Y Halevy, University of Washington
  • Daniel S Weld, University of Washington
Document Type
Conference Paper
Copyright ACM, 2004. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 2004 ACM SIGMOD International Conference on Management of Data, pages 395-406.
Abstract
An effective query optimizer finds a query plan that exploits the characteristics of the source data. In data integration, little is known in advance about sources’ properties, which necessitates the use of adaptive query processing techniques to adjust query processing on-the-fly. Prior work in adaptive query processing has focused on compensating for delays and adjusting for mis-estimated cardinality or selectivity values. In this paper, we present a generalized architecture for adaptive query processing and introduce a new technique, called adaptive data partitioning (ADP), which is based on the idea of dividing the source data into regions, each executed by different, complementary plans. We show how this model can be applied in novel ways to not only correct for underestimated selectivity and cardinality values, but also to discover and exploit order in the source data, and to detect and exploit source data that can be effectively pre-aggregated. We experimentally compare a number of alternative strategies and show that our approach is effective.
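As a rough illustration of the partitioning idea described in the abstract, the following minimal sketch splits each input of a join into two regions and runs a different join algorithm on each. It is not the authors' implementation. The function names (`split_by_order`, `merge_join`, `hash_join`, `partitioned_join`), the rule for deciding which tuples count as "ordered", and the final cross-region step are all assumptions made for this example; the cross-region step relies only on the fact that a join distributes over union.

```python
def hash_join(left, right):
    """Join two lists of (key, value) pairs on key using a hash table."""
    table = {}
    for key, value in left:
        table.setdefault(key, []).append(value)
    return [(key, lv, rv) for key, rv in right for lv in table.get(key, [])]


def merge_join(left, right):
    """Join two lists of (key, value) pairs that are already sorted by key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the run of equal keys on the right side.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            # Pair every left tuple with this key against that run.
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[jj][1]) for jj in range(j, j_end))
                i += 1
            j = j_end
    return out


def split_by_order(rows):
    """Hypothetical partitioning rule: tuples that continue a nondecreasing
    key sequence go to the 'ordered' region, all others to the 'rest' region."""
    ordered, rest = [], []
    for row in rows:
        if not ordered or row[0] >= ordered[-1][0]:
            ordered.append(row)
        else:
            rest.append(row)
    return ordered, rest


def partitioned_join(r, s):
    """Join r and s by region, using a different plan for each region."""
    r_ord, r_rest = split_by_order(r)
    s_ord, s_rest = split_by_order(s)
    result = merge_join(r_ord, s_ord)       # plan for the ordered regions
    result += hash_join(r_rest, s_rest)     # plan for the out-of-order regions
    # Assumed cross-region step: combinations the two plans above never saw.
    result += hash_join(r_ord, s_rest)
    result += hash_join(r_rest, s_ord)
    return result


r = [(1, "a"), (2, "b"), (5, "e"), (3, "c")]
s = [(2, "x"), (3, "y"), (1, "z"), (5, "w")]
print(sorted(partitioned_join(r, s)))
```

On mostly sorted inputs, most tuples fall into the ordered region and are handled by the merge join, which needs no hash table; the result is the same set of tuples a single hash join over the full inputs would produce.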
Citation Information
Zachary G Ives, Alon Y Halevy and Daniel S Weld. "Adapting to Source Properties in Processing Data Integration Queries" (2004)