Robust estimation of a regression function in exponential families

11/03/2020
by Yannick Baraud, et al.

We observe n pairs X_1=(W_1,Y_1),…,X_n=(W_n,Y_n) of independent random variables and assume, although this might not be true, that for each i∈{1,…,n}, the conditional distribution of Y_i given W_i belongs to a given exponential family with real parameter θ_i^⋆=θ^⋆(W_i), whose value is a function θ^⋆ of the covariate W_i. Given a model Θ for θ^⋆, we propose an estimator θ̂ with values in Θ whose construction does not depend on the distribution of the W_i and which is robust to contamination, outliers and model misspecification. We establish non-asymptotic exponential inequalities for the upper deviations of a Hellinger-type distance between the true distribution of the data and the estimated one based on θ̂. Under a suitable parametrization of the exponential family, we deduce a uniform risk bound for θ̂ over the class of Hölderian functions and prove that this bound is optimal up to a logarithmic factor. Finally, we provide an algorithm for computing θ̂ when θ^⋆ is assumed to belong to functional classes of low or medium dimension (in a suitable sense) and, in a simulation study, we compare the performance of θ̂ with that of the MLE and median-based estimators. The proof of our main result relies on an upper bound, with explicit numerical constants, on the expectation of the supremum of an empirical process over a VC-subgraph class. This bound may be of independent interest.
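To make the setting concrete, here is a minimal Python sketch of the kind of data the abstract describes. It is an illustration only, not the authors' estimator or algorithm. Everything specific in it is an assumption: the Poisson family with natural parameter θ (mean exp(θ)), the regression function `theta_star`, the uniform covariates, the contamination rate and the outlier value. The averaged squared Hellinger distance over observed covariates is likewise a stand-in for the Hellinger-type distance mentioned above, whose exact definition is given in the full text. It uses the closed form h²(Poisson(λ), Poisson(μ)) = 1 − exp(−(√λ − √μ)²/2).

```python
import math
import random


def theta_star(w):
    # Hypothetical regression function: natural parameter of a Poisson family.
    return 1.0 + math.sin(2.0 * math.pi * w)


def sample_poisson(lam, rng):
    # Knuth's multiplication method; adequate for the small means used here.
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(n, contamination=0.05, outlier=50, seed=0):
    # Draw n pairs (W_i, Y_i). With probability `contamination` the response
    # is replaced by a fixed outlier, so the assumed model is misspecified.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        w = rng.random()  # covariate, uniform on [0, 1) (an assumption)
        if rng.random() < contamination:
            y = outlier
        else:
            y = sample_poisson(math.exp(theta_star(w)), rng)
        data.append((w, y))
    return data


def hellinger_sq_poisson(lam, mu):
    # Squared Hellinger distance between Poisson(lam) and Poisson(mu).
    return 1.0 - math.exp(-0.5 * (math.sqrt(lam) - math.sqrt(mu)) ** 2)


def mean_hellinger_sq(theta_a, theta_b, ws):
    # Average squared Hellinger distance between the conditional
    # distributions induced by two parameter functions at the covariates ws.
    total = sum(
        hellinger_sq_poisson(math.exp(theta_a(w)), math.exp(theta_b(w)))
        for w in ws
    )
    return total / len(ws)


if __name__ == "__main__":
    data = simulate(200)
    ws = [w for w, _ in data]
    # Compare the hypothetical truth with a constant candidate function.
    print(mean_hellinger_sq(theta_star, lambda w: 1.0, ws))
```

A loss of this form compares candidate functions through the conditional distributions they induce rather than through the raw responses, so it stays bounded by 1 however extreme the contaminated responses are.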
