From Random Search to Bandit Learning in Metric Measure Spaces
Random Search is one of the most widely used methods for Hyperparameter Optimization, and is critical to the success of deep learning models. Despite its astonishing performance, little non-heuristic theory has been developed to describe its underlying working mechanism. This paper gives a theoretical accounting of Random Search. We introduce the concept of scattering dimension, which describes the landscape of the underlying function and quantifies the performance of random search. We show that, when the environment is noise-free, the output of random search converges to the optimal value in probability at rate $\mathcal{O}\left( (1/T)^{1/d_s} \right)$, where $d_s \ge 0$ is the scattering dimension of the underlying function. When the observed function values are corrupted by bounded i.i.d. noise, the output of random search converges to the optimal value in probability at rate $\mathcal{O}\left( (1/T)^{1/(d_s + 1)} \right)$. In addition, based on the principles of random search, we introduce an algorithm, called BLiN-MOS, for Lipschitz bandits in doubling metric spaces that are also endowed with a Borel measure, and show that BLiN-MOS achieves a regret rate of order $\mathcal{O}\left( T^{d_z/(d_z + 1)} \right)$, where $d_z$ is the zooming dimension of the problem instance. Our results show that, under certain conditions, the known information-theoretical lower bound for Lipschitz bandits, $\Omega\left( T^{(d_z + 1)/(d_z + 2)} \right)$, can be improved.
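For orientation, the noise-free random search procedure discussed above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes a maximization problem, a search domain of [0, 1]^dim sampled uniformly, and a hypothetical `toy_objective`; the names `random_search`, `dim`, `budget`, and `seed` are likewise illustrative.

```python
import random


def random_search(f, dim, budget, seed=0):
    """Evaluate f at `budget` uniform samples from [0, 1]^dim and keep the best.

    Assumes f is observed without noise and is to be maximized.
    """
    rng = random.Random(seed)
    best_x, best_val = None, float("-inf")
    for _ in range(budget):
        x = [rng.random() for _ in range(dim)]
        val = f(x)
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val


def toy_objective(x):
    # Hypothetical objective: maximized at (0.5, ..., 0.5) with optimal value 0.
    return -sum(abs(xi - 0.5) for xi in x)


if __name__ == "__main__":
    # The gap to the optimal value (0 here) should shrink as the budget T grows.
    for T in (10, 100, 1000, 10000):
        _, val = random_search(toy_objective, dim=2, budget=T)
        print(T, -val)
```

With a fixed seed, a larger budget extends the same sample sequence, so the reported gap never increases with T in this sketch. How fast it shrinks depends on the landscape of the objective, which is what the scattering dimension is meant to capture.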