Article
Hyperspacings and the estimation of information theoretic quantities
University of Massachusetts - Amherst Technical Report (2004)
  • Erik G Learned-Miller, University of Massachusetts - Amherst
Abstract

The estimation of probability densities from data is widely used as an intermediate step in the estimation of entropy, Kullback-Leibler (KL) divergence, and mutual information, and for statistical tasks such as hypothesis testing. We propose an alternative to density estimation: partitioning a space into regions whose approximate probability mass is known. These regions, which we call hyperspacings, generalize one-dimensional spacings and can be used for the same purposes. After discussing one-dimensional spacings-based estimates of entropy and KL divergence, we show how hyperspacings can be used to estimate these quantities (and mutual information) in higher dimensions. Our approach outperforms certain widely used estimators based on intermediate density estimates. Using similar ideas, we also present a new distribution-free hypothesis test for distributional equivalence that compares favorably with the Kolmogorov-Smirnov test and, using hyperspacings, extends easily to multiple dimensions.
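To make the one-dimensional idea concrete, the following is a minimal sketch of a classical m-spacing entropy estimator (in the style of Vasicek's estimator, which the spacings literature the abstract refers to builds on). It is an illustration of the general spacings technique, not the specific estimator of this report: each m-spacing between order statistics covers roughly m/(n+1) probability mass, so its width, rescaled by (n+1)/m, gives a local estimate of 1/p(x), and averaging the logs approximates the entropy -E[log p(X)]. The function name and the choice of m below are illustrative assumptions.

```python
import math

def m_spacing_entropy(samples, m):
    """Estimate differential entropy from 1-D samples via m-spacings.

    Sort the data; each spacing x[i+m] - x[i] holds about m/(n+1)
    probability mass, so (n+1)/m * spacing approximates 1/p(x) locally.
    Averaging log((n+1)/m * spacing) over i estimates -E[log p(X)].
    """
    x = sorted(samples)
    n = len(x)
    total = 0.0
    for i in range(n - m):
        gap = x[i + m] - x[i]
        total += math.log((n + 1) / m * gap)
    return total / (n - m)
```

For example, on a large sample from the uniform distribution on [0, 1] (true entropy 0), the estimate should be close to zero; larger m reduces variance at the cost of some smoothing bias, and a common heuristic is m on the order of the square root of n.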

Publication Date
2004
Citation Information
Erik G Learned-Miller. "Hyperspacings and the estimation of information theoretic quantities" University of Massachusetts - Amherst Technical Report Vol. 04 Iss. 104 (2004)
Available at: http://works.bepress.com/erik_learned_miller/17/