Optimal Partitions for Nonparametric Multivariate Entropy Estimation
Efficient and accurate estimation of multivariate empirical probability distributions is fundamental to the calculation of information-theoretic measures such as mutual information and transfer entropy. Common techniques include variations on histogram estimation which, whilst computationally efficient, often fail to closely approximate the probability density functions, particularly for distributions with fat tails or fine substructure, or when sample sizes are small. This paper demonstrates that the application of rotation operations can improve entropy estimates by aligning the geometry of the partition to the sample distribution. A method for generating equiprobable multivariate histograms is presented, using recursive binary partitioning, for which optimal rotations are found. Such optimal partitions were observed to be more accurate than existing techniques in estimating entropies of correlated bivariate Gaussian distributions with known theoretical values, across varying sample sizes (99% CI).
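As an illustration of the ideas summarised above, the following is a minimal sketch and not the paper's implementation. It builds an equiprobable histogram by recursive median splits along alternating axes, estimates the differential entropy as -Σ p_i log(p_i / V_i) over cells with probability mass p_i and volume V_i, and compares the result with the closed-form entropy of a correlated bivariate Gaussian. The function names (`partition`, `entropy_estimate`, `align_to_principal_axes`), the use of the sample bounding box for the outer cells, and the choice of a principal-axis alignment in place of the paper's optimal rotation search are all assumptions made for this example.

```python
import numpy as np


def partition(points, lo, hi, depth, axis=0):
    """Yield (count, volume) for cells built by recursive median splits.

    Hypothetical helper: splits alternate between axes, and each split is
    placed at the sample median so both halves hold about the same mass.
    """
    if depth == 0 or len(points) < 2:
        yield len(points), float(np.prod(hi - lo))
        return
    cut = np.median(points[:, axis])
    mask = points[:, axis] <= cut
    hi_left, lo_right = hi.copy(), lo.copy()
    hi_left[axis] = cut
    lo_right[axis] = cut
    nxt = (axis + 1) % points.shape[1]
    yield from partition(points[mask], lo, hi_left, depth - 1, nxt)
    yield from partition(points[~mask], lo_right, hi, depth - 1, nxt)


def entropy_estimate(points, depth):
    """Histogram estimate of differential entropy (in nats).

    Assumption: outer cells are closed off by the sample bounding box.
    """
    n = len(points)
    lo, hi = points.min(axis=0), points.max(axis=0)
    h = 0.0
    for count, volume in partition(points, lo, hi, depth):
        if count > 0 and volume > 0:
            p = count / n
            h -= p * np.log(p / volume)
    return h


def align_to_principal_axes(points):
    """Express the sample in the eigenbasis of its covariance matrix.

    Assumption: this orthogonal change of basis stands in for the paper's
    optimal rotation; differential entropy is unchanged by it.
    """
    centered = points - points.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    return centered @ eigvecs


rng = np.random.default_rng(0)
rho = 0.9  # illustrative correlation, not a value taken from the paper
cov = np.array([[1.0, rho], [rho, 1.0]])
sample = rng.multivariate_normal([0.0, 0.0], cov, size=4096)

# Closed form for a bivariate Gaussian: ln(2*pi*e) + 0.5 * ln(det(cov)).
h_true = np.log(2 * np.pi * np.e) + 0.5 * np.log(np.linalg.det(cov))
h_axis = entropy_estimate(sample, depth=6)
h_aligned = entropy_estimate(align_to_principal_axes(sample), depth=6)

print(f"theoretical entropy:     {h_true:.4f}")
print(f"axis-aligned partition:  {h_axis:.4f}")
print(f"principal-axis aligned:  {h_aligned:.4f}")
```

With `depth=6` the partition has 64 cells, each holding the same number of points when the sample size is a power of two. Comparing the two estimates against `h_true` over repeated draws and sample sizes is one way to explore how much alignment matters for a given correlation.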