Stochastic gradient descent algorithms for strongly convex functions at O(1/T) convergence rates

05/09/2013
by Shenghuo Zhu, et al.

With a weighting scheme proportional to t, a traditional stochastic gradient descent (SGD) algorithm achieves a high-probability convergence rate of O(κ/T) for strongly convex functions, instead of O(κ ln(T)/T), where κ is the condition number. We also prove that an accelerated SGD algorithm achieves a rate of O(κ/T).
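As a rough illustration of the weighting scheme (a minimal sketch, not the paper's exact algorithm or constants): plain SGD on a λ-strongly convex objective with the standard 1/(λt) step size, returning the average of the iterates weighted proportionally to t rather than the last iterate or the uniform average. The quadratic objective, noise model, and step-size choice below are illustrative assumptions.

    import numpy as np

    def sgd_t_weighted_average(grad, x0, lam, T, rng):
        """SGD with step size 1/(lam * t) and iterate weights proportional to t.

        grad: callable (x, rng) -> stochastic gradient at x (assumed unbiased)
        lam:  strong convexity parameter of the objective
        Returns x_bar = sum_t t * x_t / sum_t t, the t-weighted average,
        the kind of weighting the abstract says removes the ln(T) factor.
        """
        x = np.array(x0, dtype=float)
        x_bar = np.zeros_like(x)
        weight_sum = 0.0
        for t in range(1, T + 1):
            g = grad(x, rng)
            x = x - g / (lam * t)                     # classic step size under strong convexity
            weight_sum += t
            x_bar += (t / weight_sum) * (x - x_bar)   # running average with weight t on x_t
        return x_bar

    # Hypothetical usage: minimize f(x) = 0.5 * lam * ||x - x_star||^2
    # from noisy gradient estimates.
    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        lam = 1.0
        x_star = np.array([1.0, -2.0])
        noisy_grad = lambda x, rng: lam * (x - x_star) + rng.normal(scale=0.1, size=x.shape)
        x_hat = sgd_t_weighted_average(noisy_grad, np.zeros(2), lam, T=10_000, rng=rng)
        print(x_hat)  # should land close to x_star

The running-average update is algebraically identical to recomputing sum_t t * x_t / sum_t t at the end, but uses O(1) memory per step.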
