Stochastic quasi-Newton with adaptive step lengths for large-scale problems
We provide a numerically robust and fast method capable of exploiting the local geometry when solving large-scale stochastic optimisation problems. Our key innovation is an auxiliary variable construction coupled with an inverse Hessian approximation computed using a receding history of iterates and gradients. It is the Markov chain nature of the classic stochastic gradient algorithm that enables this development. The construction offers a mechanism for stochastic line search adapting the step length. We numerically evaluate and compare against current state-of-the-art with encouraging performance on real-world benchmark problems where the number of observations and unknowns is in the order of millions.
READ FULL TEXT