On the generalization error of norm penalty linear regression models

11/14/2022
by Jose Luis Montiel Olea et al.

We study linear regression problems inf_{β∈ℝ^d} (𝔼_{ℙ_n}[|Y − 𝐗^⊤β|^r])^{1/r} + δρ(β), with r ≥ 1, a convex penalty ρ, and the empirical measure of the data ℙ_n. Well-known examples include the square-root lasso, square-root sorted ℓ_1 penalization, and penalized least absolute deviations regression. We show that, under benign regularity assumptions on ρ, such procedures naturally provide robust generalization: the problem can be reformulated as a distributionally robust optimization (DRO) problem over a type of max-sliced Wasserstein ball B_δ^ρ(ℙ_n), i.e., β solves the linear regression problem if and only if it solves inf_{β∈ℝ^d} sup_{ℚ∈B_δ^ρ(ℙ_n)} 𝔼_ℚ[|Y − 𝐗^⊤β|^r]. Our proof of this result is constructive: it identifies the worst-case measure in the DRO problem, which is given by an additive perturbation of ℙ_n. We argue that the balls B_δ^ρ(ℙ_n) are the natural ones to consider in this framework, as they yield a procedure that is computationally comparable to non-robust methods while providing optimal robustness guarantees. In fact, our generalization bounds are of order d/n, up to logarithmic factors, and thus do not suffer from the curse of dimensionality, as is the case for known generalization bounds based on the Wasserstein metric on ℝ^d. Moreover, the bounds provide theoretical support for choosing a regularization parameter δ of the same order for the linear regression problem.
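As a concrete illustration of the estimator in the abstract, the sketch below solves the case r = 2 with ρ equal to the ℓ_1 norm, i.e. the square-root lasso. This is not the authors' code; the choice of cvxpy as the solver, the simulated data, and the value of the regularization parameter δ are illustrative assumptions only.

```python
# Minimal sketch (assumed, not from the paper) of
#   inf_beta (E_Pn[|Y - X'beta|^r])^(1/r) + delta * rho(beta)
# for r = 2 and rho = l1 norm, i.e. the square-root lasso.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n, d = 200, 10
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = [1.0, -2.0, 0.5]
y = X @ beta_true + 0.5 * rng.standard_normal(n)

beta = cp.Variable(d)
delta = 0.1  # placeholder value; the paper argues for delta of the same order as the generalization bound

# (E_Pn[|Y - X'beta|^2])^(1/2) = ||y - X beta||_2 / sqrt(n); add the l1 penalty.
objective = cp.norm(y - X @ beta, 2) / np.sqrt(n) + delta * cp.norm1(beta)
problem = cp.Problem(cp.Minimize(objective))
problem.solve()

print("estimated beta:", np.round(beta.value, 3))
```

Other cases in the abstract fit the same template: penalized least absolute deviations corresponds to r = 1 (replace the root-mean-square term with the mean absolute residual), and sorted ℓ_1 penalization replaces cp.norm1 with a weighted sum of the sorted absolute coefficients.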
