A Nonparametric Normality Test for High-dimensional Data

04/10/2019
by Hao Chen, et al.

Many statistical methodologies for high-dimensional data assume population normality. Although a few multivariate normality tests have been proposed, they either suffer from low power or have serious size distortion when the dimension is high. In this work, we propose a novel nonparametric test that extends graph-based two-sample tests by utilizing nearest neighbor information. Theoretical results guarantee type I error control of the proposed test when the dimension grows with the number of observations. Simulation studies verify the empirical size of the proposed test when the dimension is larger than the sample size and, at the same time, show that the new test has superior power compared with alternative methods. We also illustrate our approach on a lung cancer data set widely used in the high-dimensional classification literature, where deviation from the normality assumption may lead to completely invalid conclusions.
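
To make the general idea concrete, the following is a minimal Python sketch of a nearest-neighbor two-sample comparison between the observed data and a Gaussian reference sample. It is an illustration under stated assumptions, not the test proposed in the paper: the function names (`knn_indices`, `same_label_fraction`, `gaussian_reference`, `naive_nn_normality_check`), the choice of statistic (the fraction of nearest-neighbor pairs that fall in the same sample), and the plain permutation calibration are all hypothetical. In particular, because the reference sample is drawn from parameters estimated on the data, this naive permutation p-value carries no guarantee of type I error control.

```python
import numpy as np


def knn_indices(pooled, k=1):
    """Indices of the k nearest neighbors (Euclidean) of each row of `pooled`."""
    sq = np.sum(pooled ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * pooled @ pooled.T
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def same_label_fraction(nn, labels):
    """Fraction of (point, neighbor) pairs in which both points share a label."""
    return float(np.mean(labels[nn] == labels[:, None]))


def gaussian_reference(x, rng):
    """Draw n points from N(sample mean, sample covariance) of x.

    If Z has i.i.d. N(0, 1) entries and Xc is the centered data, the rows of
    Z @ Xc / sqrt(n - 1) have covariance Xc.T @ Xc / (n - 1), so the p x p
    covariance matrix never needs to be formed or inverted (useful when p > n).
    """
    n = x.shape[0]
    mean = x.mean(axis=0)
    centered = x - mean
    z = rng.standard_normal((n, n))
    return mean + z @ centered / np.sqrt(n - 1)


def naive_nn_normality_check(x, k=1, n_perm=200, seed=0):
    """Illustrative only: compare x with a Gaussian reference via NN labels."""
    rng = np.random.default_rng(seed)
    y = gaussian_reference(x, rng)
    pooled = np.vstack([x, y])
    labels = np.repeat([0, 1], [len(x), len(y)])
    nn = knn_indices(pooled, k)  # the neighbor graph is fixed under relabeling
    observed = same_label_fraction(nn, labels)
    perm = np.array(
        [same_label_fraction(nn, rng.permutation(labels)) for _ in range(n_perm)]
    )
    # Large same-sample fractions suggest the two samples are distinguishable.
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, float(p_value)
```

For example, `naive_nn_normality_check(x)` on an `n x p` array `x` returns the observed same-sample fraction and a permutation p-value. The neighbor graph is computed once and only the labels are permuted, which keeps the sketch cheap for small `n`.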
