Minimax Regret for Bandit Convex Optimisation of Ridge Functions
We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form f(x) = g(⟨ x, θ⟩) for convex g : ℝ→ℝ and θ∈ℝ^d. We provide a short information-theoretic proof that the minimax regret is at most O(d√(n)log(diam𝒦)) where n is the number of interactions, d the dimension and diam(𝒦) is the diameter of the constraint set. Hence, this class of functions is at most logarithmically harder than the linear case.
READ FULL TEXT