LM-CMA: an Alternative to L-BFGS for Large Scale Black-box Optimization
The limited memory BFGS method (L-BFGS) of Liu and Nocedal (1989) is often considered to be the method of choice for continuous optimization when first- and/or second- order information is available. However, the use of L-BFGS can be complicated in a black-box scenario where gradient information is not available and therefore should be numerically estimated. The accuracy of this estimation, obtained by finite difference methods, is often problem-dependent that may lead to premature convergence of the algorithm. In this paper, we demonstrate an alternative to L-BFGS, the limited memory Covariance Matrix Adaptation Evolution Strategy (LM-CMA) proposed by Loshchilov (2014). The LM-CMA is a stochastic derivative-free algorithm for numerical optimization of non-linear, non-convex optimization problems. Inspired by the L-BFGS, the LM-CMA samples candidate solutions according to a covariance matrix reproduced from m direction vectors selected during the optimization process. The decomposition of the covariance matrix into Cholesky factors allows to reduce the memory complexity to O(mn), where n is the number of decision variables. The time complexity of sampling one candidate solution is also O(mn), but scales as only about 25 scalar-vector multiplications in practice. The algorithm has an important property of invariance w.r.t. strictly increasing transformations of the objective function, such transformations do not compromise its ability to approach the optimum. The LM-CMA outperforms the original CMA-ES and its large scale versions on non-separable ill-conditioned problems with a factor increasing with problem dimension. Invariance properties of the algorithm do not prevent it from demonstrating a comparable performance to L-BFGS on non-trivial large scale smooth and nonsmooth optimization problems.
READ FULL TEXT