Optimized Algorithms to Sample Determinantal Point Processes
In this technical report, we discuss several sampling algorithms for Determinantal Point Processes (DPP). DPPs have recently gained a broad interest in the machine learning and statistics literature as random point processes with negative correlation, i.e., ones that can generate a "diverse" sample from a set of items. They are parametrized by a matrix L, called L-ensemble, that encodes the correlations between items. The standard sampling algorithm is separated in three phases: 1/ eigendecomposition of L, 2/ an eigenvector sampling phase where L's eigenvectors are sampled independently via a Bernoulli variable parametrized by their associated eigenvalue, 3/ a Gram-Schmidt-type orthogonalisation procedure of the sampled eigenvectors. In a naive implementation, the computational cost of the third step is on average O(Nμ^3) where μ is the average number of samples of the DPP. We give an algorithm which runs in O(Nμ^2) and is extremely simple to implement. If memory is a constraint, we also describe a dual variant with reduced memory costs. In addition, we discuss implementation details often missing in the literature.
READ FULL TEXT