DTN: A Learning Rate Scheme with Convergence Rate of O(1/t) for SGD

01/22/2019
by Lam M. Nguyen, et al.

We propose a novel diminishing learning rate scheme, coined Decreasing-Trend-Nature (DTN), which allows us to prove fast convergence of the Stochastic Gradient Descent (SGD) algorithm to a first-order stationary point for smooth general convex problems and for a class of nonconvex problems. We are the first to prove that SGD with a diminishing learning rate achieves a convergence rate of O(1/t) for these problems. Our theory applies in a straightforward way to neural network applications for classification problems.
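The exact DTN schedule is defined in the full paper and is not reproduced in this abstract. As a minimal sketch of the setting described above, the following Python snippet runs SGD with a generic diminishing step size eta_t = eta0 / (1 + lam * t); the schedule, the names sgd_diminishing and grad, and the parameters eta0 and lam are illustrative assumptions, not the paper's DTN rule.

import numpy as np

def sgd_diminishing(grad, w0, eta0=0.1, lam=0.1, T=1000, rng=None):
    """SGD with a generic diminishing learning rate eta_t = eta0 / (1 + lam * t).

    NOTE: this schedule is an illustrative placeholder; the DTN scheme
    proposed in the paper is defined differently and is not shown here.

    grad: function (w, rng) -> stochastic gradient estimate at w
    """
    rng = rng or np.random.default_rng(0)
    w = np.asarray(w0, dtype=float)
    for t in range(T):
        eta_t = eta0 / (1.0 + lam * t)   # diminishing step size
        w = w - eta_t * grad(w, rng)     # stochastic gradient step
    return w

# Usage: minimize E[(w - x)^2] with noisy gradients, where x ~ N(1, 0.1^2);
# the expected gradient is 2(w - 1), so the iterates drift toward w = 1.
if __name__ == "__main__":
    g = lambda w, rng: 2.0 * (w - rng.normal(1.0, 0.1))
    print(sgd_diminishing(g, w0=0.0))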
