Variable selection for varying multi-index coefficients models with applications to synergistic GxE interactions
Epidemiological evidence suggests that simultaneous exposure to multiple environmental risk factors (Es) can increase disease risk by more than the sum of the effects of the individual exposures acting alone. The interaction between a gene and multiple Es on disease risk is termed a synergistic gene-environment interaction (synG×E). Varying multi-index coefficients models (VMICM) are a promising tool for modeling synergistic G×E effects and for understanding how multiple Es jointly influence genetic risk for a disease outcome. In this work, we proposed a 3-step variable selection approach for VMICM that distinguishes three types of effect for gene variables: varying, non-zero constant, and zero effects, which correspond respectively to nonlinear synG×E, no synG×E, and no genetic effect. Among the multiple environmental exposure variables, we also estimated and selected those that contribute to the synergistic interaction effect. We theoretically evaluated the oracle property of the proposed variable selection approach. Extensive simulation studies were conducted to evaluate the finite-sample performance of the method, considering both continuous and discrete gene variables. Application to a real dataset further demonstrated the utility of the method. The method applies broadly to settings where the goal is to identify synergistic interaction effects.
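For orientation, the three effect types described above can be sketched in a generic varying multi-index coefficient form. The notation below (response Y, gene variables G_l, exposure vector X, index vectors beta_l, coefficient functions m_l, constants c_l) is assumed for illustration only and is not taken from the abstract; the full text may use a different parameterization and identifiability constraints.

```latex
% Illustrative sketch only: all symbols are assumptions, not the paper's notation.
\[
  Y \;=\; \sum_{l=1}^{d} m_l\!\left(\boldsymbol{\beta}_l^{\top}\mathbf{X}\right) G_l \;+\; \varepsilon,
  \qquad
  m_l(\cdot) \;=\;
  \begin{cases}
    \text{a nonconstant function} & \text{varying effect (nonlinear synG$\times$E)},\\[2pt]
    c_l \neq 0 & \text{non-zero constant effect (no synG$\times$E)},\\[2pt]
    0 & \text{zero effect (no genetic effect)}.
  \end{cases}
\]
```

Under this reading, selecting among the three cases for each m_l classifies the gene variables, while zero entries in an index vector beta_l would mark environmental variables that do not contribute to the synergistic effect.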