Wasserstein Variational Gradient Descent: From Semi-Discrete Optimal Transport to Ensemble Variational Inference
Particle-based variational inference offers a flexible way of approximating complex posterior distributions with a set of particles. In this paper we introduce a new particle-based variational inference method based on the theory of semi-discrete optimal transport. Instead of minimizing the KL divergence between the posterior and the variational approximation, we minimize a semi-discrete optimal transport divergence. The solution of the resulting optimal transport problem provides both a particle approximation and a set of optimal transportation densities that map each particle to a segment of the posterior distribution. We approximate these transportation densities by minimizing the KL divergence between a truncated distribution and the optimal transport solution. The resulting algorithm can be interpreted as a form of ensemble variational inference where each particle is associated with a local variational approximation.
READ FULL TEXT