In certain situations, syntactically valid but incorrect data entered into a database can cause near-immediate, catastrophic financial losses for an organization. Examples include omitting zeros in the prices of goods on e-commerce sites, and financial fraud in which data is entered directly into databases, bypassing application-level financial checks. Such "dangerous data" can, and should, be detected because it deviates substantially from the statistical properties of existing data. Detecting this kind of problem requires comparing individual data items to a large amount of existing data in the database at run time. Furthermore, the identification of errors is probabilistic rather than deterministic in nature. This research proposes part-whole validation as an approach to addressing the dangerous data situation. Part-whole validation addresses fundamental issues in database management, for example, integrity maintenance. Illustrative and representative examples are first defined and analyzed. Then, an architecture for part-whole validation is presented and implemented in a prototype to illustrate the feasibility of the research.
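To make the idea of comparing one new value against the statistical properties of existing data concrete, the following is a minimal sketch. It is not the part-whole validation architecture described in this research. The function name `is_dangerous`, the log-scale median/MAD score, and the threshold of 3.5 are all illustrative assumptions, chosen because a dropped zero shifts a price by about one order of magnitude.

```python
import math
import statistics


def is_dangerous(new_value, existing_values, threshold=3.5):
    """Return True if new_value looks like an outlier relative to existing_values.

    Hypothetical helper: compares values on a log10 scale using a robust
    z-score (median and median absolute deviation). The result is a
    probabilistic flag for review, not a deterministic verdict.
    """
    logs = [math.log10(v) for v in existing_values if v > 0]
    if new_value <= 0 or len(logs) < 2:
        # Assumption: with too little history, or a non-positive value,
        # treat the entry as suspicious.
        return True

    median = statistics.median(logs)
    mad = statistics.median(abs(x - median) for x in logs)
    new_log = math.log10(new_value)

    if mad == 0:
        # All existing values are effectively identical; any difference is flagged.
        return new_log != median

    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    score = 0.6745 * abs(new_log - median) / mad
    return score > threshold


existing_prices = [19.99, 21.50, 20.75, 18.90, 22.10, 20.00]
print(is_dangerous(2.00, existing_prices))   # price with a dropped zero
print(is_dangerous(20.50, existing_prices))  # price in the usual range
```

In a real database, the existing values would come from a query at run time, and the threshold would have to be tuned to the acceptable rate of false alarms.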
- Audit,
- Boyce-Codd normal form,
- Business rules,
- Dangerous data,
- Data management,
- Data quality,
- Database design,
- Part-whole validation,
- Relational databases
Available at: http://works.bepress.com/cecil-chua/8/