Mixture weights optimisation for Alpha-Divergence Variational Inference
This paper focuses on α-divergence minimisation methods for Variational Inference. More precisely, we are interested in algorithms that optimise the mixture weights of any given mixture model, without any information on the underlying distribution of its mixture components parameters. The Power Descent, defined for all α ≠ 1, is one such algorithm, and we establish in our work the full proof of its convergence towards the optimal mixture weights when α < 1. Since the α-divergence recovers the widely-used forward Kullback-Leibler divergence as α → 1, we then extend the Power Descent to the case α = 1 and show that we obtain an Entropic Mirror Descent. This leads us to investigate the link between the Power Descent and the Entropic Mirror Descent: first-order approximations allow us to introduce the Renyi Descent, a novel algorithm for which we prove an O(1/N) convergence rate. Lastly, we numerically compare the behavior of the unbiased Power Descent and of the biased Renyi Descent, and we discuss the potential advantages of one algorithm over the other.
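To make the setting concrete, the sketch below shows the generic Entropic Mirror Descent update on the probability simplex, i.e. the multiplicative-weights scheme that the abstract's mixture-weight algorithms are variants of. This is a hedged illustration only: the toy linear objective, step size, and function name are assumptions for demonstration, not the paper's actual update (which involves α-divergence gradients with respect to the mixture components).

```python
import numpy as np

def entropic_mirror_descent_step(weights, grad, eta):
    """One multiplicative-weights update: w_i <- w_i * exp(-eta * g_i),
    renormalised so the weights stay on the probability simplex."""
    # Subtracting grad.max() is a standard numerical-stability shift;
    # the constant factor cancels in the normalisation.
    w = weights * np.exp(-eta * (grad - grad.max()))
    return w / w.sum()

# Toy objective: minimise the linear loss <c, w> over the simplex
# (here the gradient is simply the constant vector c).
c = np.array([0.3, 0.1, 0.6])
w = np.full(3, 1.0 / 3.0)  # start from uniform mixture weights
for _ in range(200):
    w = entropic_mirror_descent_step(w, c, eta=0.5)

# The weight mass concentrates on the coordinate with the smallest cost.
print(w)
```

Because the update multiplies rather than adds, the iterates remain strictly positive and normalised at every step, which is exactly the property needed when the variables are mixture weights.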