A top-k query shortlists the k records in a dataset that best match the user's preferences. To indicate her preferences, the user typically determines a numeric weight for each data dimension (i.e., attribute). We refer to these weights collectively as the query vector. Based on this vector, each data record is implicitly mapped to a score value (via a weighted sum function). The records with the k largest scores are reported as the result. In this paper we propose an auxiliary feature to standard top-k query processing. Specifically, we compute the maximal locus within which the query vector incurs no change in the current top-k result. In other words, we compute all possible query weight settings that produce exactly the same top-k result as the user's original query. We call this locus the global immutable region (GIR). The GIR can be used as a guide to query vector readjustments, as a sensitivity measure for the top-k result, as well as to enable effective result caching. We develop efficient algorithms for GIR computation, and verify their robustness using a variety of real and synthetic datasets.
- Sensitivity analysis,
- Top-k search,
- algorithms,
- Data dimensions,
- Query vectors,
- Sensitivity measures,
- Synthetic datasets,
- Top-k query processing,
- User's preferences,
- Weight setting
Available at: http://works.bepress.com/hweehwa-pang/36/