A Simple Correction Procedure for High-Dimensional Generalized Linear Models with Measurement Error
We consider high-dimensional generalized linear models when the covariates are contaminated by measurement error. Estimates from errors-in-variables regression models are well known to be biased in traditional low-dimensional settings if the error is not accounted for. Such models have recently become of interest when regularizing penalties are added to the estimation procedure. Unfortunately, correcting for the mismeasurements can add undue computational difficulties to the optimization, which requires a new tool set for practitioners to successfully use the models. We investigate a general procedure that utilizes the recently proposed Imputation-Regularized Optimization algorithm for high-dimensional errors-in-variables models, which we implement for continuous, binary, and count response types. Crucially, our method allows off-the-shelf linear regression methods to be employed in the presence of contaminated covariates. We apply our correction to gene microarray data and illustrate that it results in a great reduction in the number of false positives whilst still retaining most true positives.
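The bias referred to above can be seen in a minimal low-dimensional sketch. This is not the Imputation-Regularized Optimization procedure investigated here; it only simulates a single covariate with classical additive measurement error and shows that the naive least-squares slope shrinks toward zero. The function names, the sample size, and the variances are illustrative assumptions, and the moment-based rescaling at the end assumes the error variance is known.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on a single covariate x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, sd_x=1.0, sd_u=1.0, seed=0):
    """Hypothetical setup: true covariate x, observed w = x + u, response y."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]      # unobserved true covariate
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]       # contaminated measurement
    y = [beta * xi + rng.gauss(0.0, 0.5) for xi in x] # continuous response

    # Naive fit ignores the measurement error and regresses y on w.
    naive = ols_slope(w, y)

    # Reliability ratio var(x) / (var(x) + var(u)); assumed known here.
    reliability = sd_x ** 2 / (sd_x ** 2 + sd_u ** 2)

    # Simple method-of-moments rescaling, valid for this one-covariate case.
    corrected = naive / reliability
    return naive, corrected


naive, corrected = simulate()
print(f"naive slope: {naive:.3f}, rescaled slope: {corrected:.3f}")
```

With equal variances for the covariate and the error, the naive slope lands near half of the true coefficient of 2, and the rescaled slope is close to 2. In high dimensions with penalized estimation no such closed-form rescaling is available, which is the setting the procedure above targets.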