Implicit stochastic approximation

10/04/2015
by Panos Toulis, et al.

The need to carry out parameter estimation from massive data has reinvigorated interest in iterative estimation methods in statistics and machine learning. Classic work includes deterministic gradient-based methods, such as quasi-Newton methods, as well as stochastic gradient descent and its variants, including adaptive learning rates, acceleration, and averaging. Current work increasingly relies on methods that employ proximal operators, leading to updates defined through implicit equations, which need to be solved at each iteration. Such methods are especially attractive in modern problems with massive data because, among other reasons, they are numerically stable and converge under minimal assumptions. However, while the majority of existing methods can be subsumed into the gradient-free stochastic approximation framework developed by Robbins and Monro (1951), there is no such framework for methods with implicit updates. Here, we conceptualize a gradient-free implicit stochastic approximation procedure and develop asymptotic and non-asymptotic theory for it. This new framework provides a theoretical foundation for gradient-based procedures that rely on implicit updates, and opens the door to iterative estimation methods that require neither a gradient nor a fully known likelihood.
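To make the explicit/implicit contrast concrete, the sketch below compares a classic Robbins-Monro iterate with its implicit counterpart on a hypothetical one-dimensional root-finding problem. It is not taken from the paper: the choice of H, the step-size schedule a_n = a/n, and the closed-form solve of the implicit equation are illustrative assumptions that hold only because H is linear in theta here.

```python
import numpy as np

# Hypothetical toy problem: find the root theta* of h(theta) = E[H(theta, X)],
# observing only noisy evaluations H(theta, X) = (theta - theta_star) + noise.
rng = np.random.default_rng(0)
theta_star = 2.0

def H(theta, noise):
    return (theta - theta_star) + noise

def explicit_rm(n_iter=5000, a=1.0):
    # Classic Robbins-Monro: theta_{n+1} = theta_n - a_n * H(theta_n, X_n)
    theta = 0.0
    for n in range(1, n_iter + 1):
        a_n = a / n
        theta = theta - a_n * H(theta, rng.normal())
    return theta

def implicit_rm(n_iter=5000, a=1.0):
    # Implicit update: theta_{n+1} = theta_n - a_n * H(theta_{n+1}, X_n).
    # Because this H is linear in theta, the implicit equation has a closed form:
    #   theta_{n+1} = (theta_n + a_n * (theta_star - noise)) / (1 + a_n)
    theta = 0.0
    for n in range(1, n_iter + 1):
        a_n = a / n
        noise = rng.normal()
        theta = (theta + a_n * (theta_star - noise)) / (1.0 + a_n)
    return theta

print(explicit_rm(), implicit_rm())  # both iterates approach theta_star = 2.0
```

For a general nonlinear H the implicit equation has no closed form and would be solved numerically (e.g., with a scalar root solver) at each iteration; that per-step solve is the cost the abstract refers to, traded for the numerical stability of the implicit iterate.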
