Scale matrix estimation under data-based loss in high and low dimensions
We consider the problem of estimating the scale matrix Σ of the additive model Y_n×p = M + ℰ from a decision-theoretic point of view. Here, p is the number of variables, n is the number of observations, M is a matrix of unknown parameters with rank q < p, and ℰ is a random noise whose distribution is elliptically symmetric with covariance matrix proportional to I_n ⊗ Σ. We deal with a canonical form of this model in which Y is decomposed into two matrices, namely Z_q×p, which summarizes the information contained in M, and U_m×p, where m = n − q, which summarizes the information sufficient to estimate Σ. As the natural estimators of the form Σ̂_a = a S (where S = U^T U and a is a positive constant) perform poorly when p > m (S being non-invertible), we propose estimators of the form Σ̂_a,G = a (S + S S^+ G(Z, S)), where S^+ is the Moore-Penrose inverse of S (which coincides with S^-1 when S is invertible). We provide conditions on the correction matrix S S^+ G(Z, S) under which Σ̂_a,G improves over Σ̂_a under the data-based loss L_S(Σ, Σ̂) = tr(S^+ Σ (Σ̂ Σ^-1 − I_p)^2). We adopt a unified approach to the two cases where S is invertible (p ≤ m) and where S is non-invertible (p > m).
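A minimal numerical sketch (not taken from the paper) may help fix the notation. It uses Gaussian noise as a stand-in for the elliptically symmetric error, a purely hypothetical correction G(Z, S) proportional to the identity, and an illustrative choice of the constant a; none of these are the paper's recommended choices. It only shows how Σ̂_a, Σ̂_a,G and the data-based loss are assembled from Z, U, S and S^+.

```python
# Illustrative sketch only: Gaussian noise, placeholder G, illustrative a.
import numpy as np

rng = np.random.default_rng(0)
p, n, q = 8, 6, 2            # high-dimensional case: p > m = n - q
m = n - q

# A positive-definite "true" scale matrix Sigma (arbitrary choice)
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)

# Canonical form: Z (q x p) carries the mean information,
# U (m x p) carries the information used to estimate Sigma.
L = np.linalg.cholesky(Sigma)
Z = rng.normal(size=(q, p)) @ L.T
U = rng.normal(size=(m, p)) @ L.T

S = U.T @ U                         # p x p, rank at most m (< p here)
S_pinv = np.linalg.pinv(S)          # Moore-Penrose inverse; equals S^-1 when p <= m

a = 1.0 / (m + p + 1)               # illustrative constant, not the paper's optimal a

# Natural estimator and a corrected estimator with a placeholder G(Z, S)
Sigma_hat_a = a * S
G = 0.1 * (np.trace(S) / p) * np.eye(p)        # hypothetical correction, for illustration only
Sigma_hat_aG = a * (S + S @ S_pinv @ G)

def data_based_loss(Sigma, Sigma_hat, S_pinv):
    """L_S(Sigma, Sigma_hat) = tr( S^+ Sigma (Sigma_hat Sigma^-1 - I_p)^2 )."""
    p = Sigma.shape[0]
    D = Sigma_hat @ np.linalg.inv(Sigma) - np.eye(p)
    return np.trace(S_pinv @ Sigma @ D @ D)

print(data_based_loss(Sigma, Sigma_hat_a, S_pinv))
print(data_based_loss(Sigma, Sigma_hat_aG, S_pinv))
```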