Latent Association Mining in Binary Data

11/28/2017
by   Kelly Bodwin, et al.
0

We consider the problem of identifying groups of mutually associated variables in moderate or high dimensional data. In many cases, ordinary Pearson correlation provides useful information concerning the linear relationship between variables. However, for binary data, ordinary correlation may lose power and may lack interpretability. In this paper, we develop and investigate a new method called Latent Association Mining in Binary Data (LAMB). The LAMB method is built on the assumption that the binary observations represent a random thresholding of a latent continuous variable that may have a complex correlation structure. We consider a new measure of association, latent correlation, that is designed to assess association in the underlying continuous variable, without bias due to the mediating effects of the thresholding procedure. The full LAMB procedure makes use of iterative hypothesis testing to identify groups of latently correlated variables. LAMB is shown to improve power over existing methods in simulated settings, to be computationally efficient for large datasets, and to uncover new meaningful results from common real data types.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset