Variable Selection in GLM and Cox Models with Second-Generation P-Values
Variable selection has become a pivotal choice in data analyses that impacts subsequent inference and prediction. In linear models, variable selection using Second-Generation P-Values (SGPV) has been shown to be as good as any other algorithm available to researchers. Here we extend the idea of Penalized Regression with Second-Generation P-Values (ProSGPV) to the generalized linear model (GLM) and Cox regression settings. The proposed ProSGPV extension is largely free of tuning parameters, adaptable to various regularization schemes and null bound specifications, and computationally fast. As in the linear case, it excels in support recovery and parameter estimation while maintaining strong prediction performance. The algorithm also performs as well as its competitors in the high-dimensional setting (p > n). Slight modifications of the algorithm improve its performance when data are highly correlated or when signals are dense. This work significantly strengthens the case for the ProSGPV approach to variable selection.