The Mismatch Principle: Statistical Learning Under Large Model Uncertainties

08/20/2018
by Martin Genzel, et al.

We study the learning capacity of empirical risk minimization with regard to the squared loss and a convex hypothesis class consisting of linear functions. While these types of estimators were originally designed for noisy linear regression problems, it recently turned out that they are in fact capable of handling considerably more complicated situations involving highly non-linear distortions. This work intends to provide a comprehensive explanation of this somewhat astonishing phenomenon. At the heart of our analysis stands the mismatch principle, which is a simple yet generic recipe for establishing theoretical error bounds for empirical risk minimization. The scope of our results is fairly general, permitting arbitrary sub-Gaussian input-output pairs, possibly with strongly correlated feature variables. Notably, the mismatch principle also generalizes, to a certain extent, the classical orthogonality principle for ordinary least squares. This adaptation allows us to investigate problem setups of recent interest, most importantly, high-dimensional parameter regimes and non-linear observation processes. In particular, our theoretical framework is applied to various scenarios of practical relevance, such as single-index models, variable selection, and strongly correlated designs. We thereby demonstrate the key purpose of the mismatch principle, that is, learning (semi-)parametric output rules under large model uncertainties and misspecifications.
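The phenomenon the abstract describes can be illustrated with a small numerical experiment: an ordinary least-squares fit with a linear hypothesis class, applied to data generated by a single-index model with an unknown non-linearity, still recovers the index direction up to an unknown scalar. The sketch below is not the paper's code; the specific choices (tanh non-linearity, Gaussian design, noise level, dimensions) are illustrative assumptions.

```python
import numpy as np

# Minimal sketch (illustrative assumptions, not the paper's setup):
# single-index observations y = f(<x, beta>) + noise with f = tanh unknown
# to the estimator, fitted by plain least squares over linear functions.

rng = np.random.default_rng(0)

n, d = 5000, 20                        # sample size, feature dimension
beta = np.zeros(d)
beta[:3] = [1.0, -2.0, 0.5]
beta /= np.linalg.norm(beta)           # ground-truth index direction (unit norm)

X = rng.standard_normal((n, d))        # standard Gaussian design
y = np.tanh(X @ beta) + 0.1 * rng.standard_normal(n)   # non-linear outputs

# "Mismatched" estimator: least squares with a linear hypothesis class,
# even though the true output rule is non-linear.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Up to an unknown scaling, beta_hat aligns with the index direction beta.
cosine = beta_hat @ beta / np.linalg.norm(beta_hat)
print(f"cosine similarity between beta_hat and beta: {cosine:.3f}")
```

In this toy setting the printed cosine similarity is close to 1, which is the kind of behavior the mismatch principle is meant to explain and quantify in far more general (sub-Gaussian, correlated, high-dimensional) regimes.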
