Optimality and Stability in Non-Convex-Non-Concave Min-Max Optimization
Convergence to a saddle point for convex-concave functions has been studied for decades, while the last few years have seen a surge of interest in non-convex-non-concave min-max optimization, driven by the rise of deep learning. However, how local optimal points should be defined, and which algorithms can converge to such points, remain intriguing research challenges. We study definitions of "local min-max (max-min)" points and provide an elegant unification, together with the corresponding first- and second-order necessary and sufficient conditions. We further show that quadratic games, which are often used as illustrative examples and as approximations of smooth functions, are too special, both locally and globally. Lastly, we derive exact conditions for the local convergence of several popular gradient algorithms near the "local min-max" points defined above, identify "valid" hyper-parameters, and compare the respective stable sets. Our results offer insights into the necessity of two-time-scale algorithms and the limitations of the commonly used approach based on ordinary differential equations.
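To make the setting concrete, the following is a minimal sketch, not the paper's algorithm, of two-time-scale gradient descent-ascent on a toy quadratic game. The coefficients a, b, c and the step sizes eta_x, eta_y are illustrative assumptions chosen so the iteration is stable; the paper analyzes the exact conditions such hyper-parameters must satisfy.

```python
# Minimal sketch (assumed toy example, not the paper's method) of
# two-time-scale gradient descent-ascent (GDA) on the quadratic game
#   f(x, y) = a*x**2 + b*x*y - c*y**2,
# where x minimizes and y maximizes. The max player is given the
# faster time scale via a larger step size.

a, b, c = 1.0, 3.0, 1.0      # f is convex in x and concave in y here
eta_x, eta_y = 0.01, 0.1     # two time scales: eta_y >> eta_x (assumed)

def grad(x, y):
    # Analytic gradients of f(x, y) = a*x**2 + b*x*y - c*y**2
    return 2 * a * x + b * y, b * x - 2 * c * y

x, y = 1.0, 1.0
for _ in range(2000):
    gx, gy = grad(x, y)
    x -= eta_x * gx          # descent step for the min player x
    y += eta_y * gy          # ascent step for the max player y

print(f"iterate near the saddle (0, 0): x={x:.4f}, y={y:.4f}")
```

For these particular step sizes the linear update map is a contraction, so the iterates spiral into the saddle at the origin; with eta_x = eta_y the same game can instead cycle or diverge, which is the kind of phenomenon motivating the paper's stability analysis of hyper-parameter choices.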