Escape saddle points by a simple gradient-descent based algorithm
Escaping saddle points is a central research topic in nonconvex optimization. In this paper, we propose a simple gradient-based algorithm that, for a smooth function f: ℝ^n → ℝ, outputs an ϵ-approximate second-order stationary point in Õ(log n/ϵ^1.75) iterations. Compared to the previous state-of-the-art algorithms by Jin et al., which take Õ((log n)^4/ϵ^2) or Õ((log n)^6/ϵ^1.75) iterations, our algorithm is polynomially better in terms of log n and matches their complexities in terms of 1/ϵ. In the stochastic setting, our algorithm outputs an ϵ-approximate second-order stationary point in Õ((log n)^2/ϵ^4) iterations. Technically, our main contribution is the idea of implementing a robust Hessian power method using only gradients, which can find negative curvature near saddle points and achieves the polynomial speedup in log n over perturbed gradient descent methods. Finally, we also perform numerical experiments that support our results.
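To make the key idea concrete, here is a minimal sketch (not the paper's algorithm) of a Hessian power method that uses only gradients: Hessian-vector products are approximated by finite differences of gradients, and a power iteration on a shifted operator extracts a negative-curvature direction near a saddle point. The function names, the shift by an upper bound ℓ on the Hessian's largest eigenvalue, and the toy objective are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hvp(grad, x, v, r=1e-4):
    """Approximate the Hessian-vector product H(x) v using only gradients,
    via the finite difference (grad(x + r v) - grad(x)) / r."""
    return (grad(x + r * v) - grad(x)) / r

def negative_curvature_direction(grad, x, ell, n_iters=100, r=1e-4, seed=0):
    """Power iteration on the shifted operator (ell*I - H) to approximate the
    eigenvector of H(x) with the smallest (most negative) eigenvalue.
    `ell` is assumed to upper-bound the largest eigenvalue of H (e.g. the
    gradient-Lipschitz constant), so the top eigenvector of ell*I - H
    corresponds to the bottom eigenvector of H."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(x.shape)
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        w = ell * v - hvp(grad, x, v, r)   # apply (ell*I - H) using gradients only
        v = w / np.linalg.norm(w)
    curvature = v @ hvp(grad, x, v, r)     # Rayleigh quotient v^T H v
    return v, curvature

# Toy usage: f(x, y) = x^2 - y^2 has a saddle at the origin with curvature -2 along y.
grad = lambda z: np.array([2 * z[0], -2 * z[1]])
v, curv = negative_curvature_direction(grad, x=np.array([1e-3, 1e-3]), ell=2.0)
print(v, curv)   # v is approximately (0, ±1), curv is approximately -2
```

The returned direction v can then be used as a descent direction to leave the saddle region; the paper's contribution is making this gradient-only power method robust and analyzing its iteration complexity.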