A Framework for Management of Semistructured Probabilistic Data
Journal of Intelligent Information Systems
  • Wenzhong Zhao, University of Kentucky
  • Alex Dekhtyar, University of Kentucky
  • Judy Goldsmith, University of Kentucky
Publication Date
2005

This paper describes the theoretical framework and implementation of a database management system for storing and manipulating diverse probability distributions of discrete random variables with finite domains, and associated information. A formal Semistructured Probabilistic Object (SPO) data model and a Semistructured Probabilistic Query Algebra (SP-algebra) are proposed. The SP-algebra supports standard database queries as well as some specific to probabilities, such as conditionalization and marginalization. Thus, the Semistructured Probabilistic Database may be used as a backend to any application that involves the management of large quantities of probabilistic information, such as building stochastic models. The implementation uses XML encoding of SPOs to facilitate communication with diverse applications. The database management system has been implemented on top of a relational DBMS. The translation of SP-algebra queries into relational queries is discussed here, and the results of initial experiments evaluating the system are reported.
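
As a rough illustration of the two probability-specific operations named in the abstract, the sketch below applies the textbook definitions of marginalization and conditionalization to a small joint distribution over discrete variables. It is not the paper's SP-algebra or its SPO encoding: the dictionary representation, the function names `marginalize` and `conditionalize`, the choice to renormalize after conditioning, and the example variables are all assumptions made for illustration.

```python
from collections import defaultdict

def marginalize(table, variables, keep):
    """Sum out every variable not listed in `keep` (hypothetical helper)."""
    idx = [variables.index(v) for v in keep]
    out = defaultdict(float)
    for assignment, p in table.items():
        # Project each assignment onto the kept variables and accumulate mass.
        out[tuple(assignment[i] for i in idx)] += p
    return dict(out)

def conditionalize(table, variables, var, value):
    """Keep rows where `var` equals `value`, drop `var`, and renormalize.

    Renormalizing is the textbook convention assumed here; the paper's
    operator may define this step differently.
    """
    pos = variables.index(var)
    selected = {a: p for a, p in table.items() if a[pos] == value}
    total = sum(selected.values())
    if total == 0:
        raise ValueError(f"{var}={value!r} has zero probability")
    return {a[:pos] + a[pos + 1:]: p / total for a, p in selected.items()}

# Illustrative joint distribution P(weather, traffic); the values are made up.
variables = ["weather", "traffic"]
joint = {
    ("sun", "light"): 0.4,
    ("sun", "heavy"): 0.1,
    ("rain", "light"): 0.2,
    ("rain", "heavy"): 0.3,
}

p_weather = marginalize(joint, variables, ["weather"])
p_traffic_given_rain = conditionalize(joint, variables, "weather", "rain")
```

Here `p_weather` is the distribution over `weather` alone, and `p_traffic_given_rain` is the distribution over `traffic` restricted to the rows where `weather` is `rain`.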

Wenzhong Zhao, Alex Dekhtyar and Judy Goldsmith. "A Framework for Management of Semistructured Probabilistic Data" Journal of Intelligent Information Systems Vol. 25 Iss. 3 (2005) p. 293 - 332
