A sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics
We establish a sharp uniform-in-time error estimate for Stochastic Gradient Langevin Dynamics (SGLD), a popular sampling algorithm. Under mild assumptions, we obtain a uniform-in-time O(η^2) bound on the KL-divergence between the law of the SGLD iterates and that of the Langevin diffusion, where η is the step size (or learning rate). Our analysis also covers varying step sizes. From this, we derive an O(η) bound on the distance between the law of the SGLD iterates and the invariant distribution of the Langevin diffusion, in both Wasserstein and total variation distances.
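To make the object of the estimate concrete, the following is a minimal NumPy sketch of the SGLD iteration x_{k+1} = x_k - η g_k + sqrt(2η) ξ_k, where g_k is a stochastic (minibatch) gradient of the potential and ξ_k is standard Gaussian noise. The quadratic potential, the synthetic data, and all names below are illustrative assumptions for this sketch, not constructions from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy potential U(x) = (1/n) * sum_i |x - y_i|^2 / 2 built from n synthetic
# "data points" y_i (hypothetical setup). Its Gibbs measure exp(-U) is a
# Gaussian centered at the data mean, so the output is easy to check.
n, d = 1000, 2
data = rng.normal(size=(n, d))

def stochastic_grad(x, batch):
    # Unbiased minibatch estimate of grad U(x) = x - mean_i(y_i).
    return x - data[batch].mean(axis=0)

def sgld(x0, eta, num_steps, batch_size=32):
    # SGLD iteration: x_{k+1} = x_k - eta * g_k + sqrt(2 * eta) * xi_k,
    # with g_k the stochastic gradient and xi_k ~ N(0, I).
    x = np.array(x0, dtype=float)
    for _ in range(num_steps):
        batch = rng.integers(0, n, size=batch_size)
        noise = rng.normal(size=x.shape)
        x = x - eta * stochastic_grad(x, batch) + np.sqrt(2.0 * eta) * noise
        # A schedule eta_k could replace the fixed eta here; the abstract
        # notes the analysis remains valid for varying step sizes.
    return x

# Run independent chains and compare against the target mean.
samples = np.array([sgld(np.zeros(d), eta=0.01, num_steps=2000)
                    for _ in range(200)])
print("empirical mean:", samples.mean(axis=0))  # should approach data.mean(axis=0)
```

In this sketch, the O(η^2) KL bound of the paper concerns the gap between the law of these iterates and the continuous-time Langevin diffusion dX_t = -∇U(X_t) dt + sqrt(2) dW_t, while the O(η) bound concerns the gap to the diffusion's invariant distribution.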