Between hard and soft thresholding: optimal iterative thresholding algorithms
Iterative thresholding algorithms seek to optimize a differentiable objective function over a sparsity or rank constraint by alternating between gradient steps that reduce the objective and thresholding steps that enforce the constraint. This work examines the choice of the thresholding operator and asks whether stronger guarantees can be achieved than with hard thresholding. We develop the notion of relative concavity of a thresholding operator, a quantity that characterizes the convergence performance of any thresholding operator on the target optimization problem. Surprisingly, we find that commonly used thresholding operators, such as hard thresholding and soft thresholding, are suboptimal in terms of convergence guarantees. Instead, a general class of thresholding operators, lying between hard thresholding and soft thresholding, is shown to be optimal with the strongest possible convergence guarantee among all thresholding operators. Examples of this general class include ℓ_q thresholding with appropriate choices of q, and a newly defined reciprocal thresholding operator. As a byproduct of the improved convergence guarantee, these new thresholding operators improve on the best known upper bound for prediction error of both iterative hard thresholding and the Lasso in terms of the dependence on condition number in the setting of sparse linear regression.
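To make the alternation between gradient and thresholding steps concrete, here is a minimal sketch of iterative hard thresholding for sparse linear regression with the least-squares objective f(x) = 0.5‖Ax − y‖². Hard thresholding is used only as the simplest instance of this template, not as the optimal operator studied in the paper; the function names, step size eta, and iteration count T are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]
    out[idx] = x[idx]
    return out

def iterative_thresholding(A, y, k, eta=None, T=200):
    """Alternate gradient steps on 0.5*||Ax - y||^2 with a thresholding
    step that enforces the sparsity constraint ||x||_0 <= k."""
    n = A.shape[1]
    if eta is None:
        # conservative step size based on the spectral norm of A (an assumption)
        eta = 1.0 / (np.linalg.norm(A, 2) ** 2)
    x = np.zeros(n)
    for _ in range(T):
        grad = A.T @ (A @ x - y)               # gradient step reduces the objective
        x = hard_threshold(x - eta * grad, k)  # thresholding step enforces sparsity
    return x

# Usage example on synthetic data
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 300))
x_true = np.zeros(300)
x_true[:5] = rng.standard_normal(5)
y = A @ x_true + 0.01 * rng.standard_normal(100)
x_hat = iterative_thresholding(A, y, k=5)
```

Other thresholding operators, such as soft thresholding or the operators studied in the paper, would replace `hard_threshold` in the same loop.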