Convergence guarantee for the sparse monotone single index model
We consider a high-dimensional monotone single index model (hdSIM), a semiparametric extension of the high-dimensional generalized linear model (hdGLM) in which the link function is unknown but constrained to be monotone non-decreasing. We develop a scalable projection-based iterative approach, the "Sparse Orthogonal Descent Single-Index Model" (SOD-SIM), which alternates between sparse-thresholded orthogonalized "gradient-like" steps and isotonic regression steps to recover the coefficient vector. Our main contribution is a set of finite-sample estimation bounds for both the coefficient vector and the link function in high-dimensional settings, under very mild assumptions on the design matrix 𝐗, the error term ϵ, and their dependence. The convergence rate for the link function matches the low-dimensional isotonic regression minimax rate, n^(-1/3), up to poly-logarithmic terms. The convergence rate for the coefficients is also n^(-1/3) up to poly-logarithmic terms. The method applies to many real data problems, including GLMs with a misspecified link, classification with mislabeled data, and classification with positive-unlabeled (PU) data. We study its performance through numerical studies and an application to rocker protein sequence data.
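To make the alternating structure concrete, the following is a minimal sketch of the general idea, not the SOD-SIM algorithm itself. It assumes a plain Isotron-style residual update in place of the orthogonalized "gradient-like" step, hard thresholding to a known sparsity level s, and a unit-norm constraint on the coefficients for identifiability. All function names, the step size, and the simulated tanh link are illustrative assumptions rather than choices taken from the paper.

```python
import math
import random


def pava(y):
    """Least-squares non-decreasing fit to y (pool adjacent violators)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in y:
        blocks.append([v, 1])
        # Merge while the previous block's mean exceeds the last block's mean.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def hard_threshold(beta, s):
    """Keep the s largest-magnitude entries of beta and zero out the rest."""
    keep = set(sorted(range(len(beta)), key=lambda j: abs(beta[j]), reverse=True)[:s])
    return [b if j in keep else 0.0 for j, b in enumerate(beta)]


def normalize(beta):
    """Rescale beta to unit Euclidean norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(b * b for b in beta))
    return [b / norm for b in beta] if norm > 0 else list(beta)


def isotonic_link(X, y, beta):
    """Fit a non-decreasing link on the index <x_i, beta>; values in input order."""
    z = [sum(xj * bj for xj, bj in zip(x, beta)) for x in X]
    order = sorted(range(len(y)), key=lambda i: z[i])
    fit_sorted = pava([y[i] for i in order])
    g = [0.0] * len(y)
    for rank, i in enumerate(order):
        g[i] = fit_sorted[rank]
    return g


def fit_sparse_monotone_sim(X, y, s, n_iter=30, step=1.0):
    """Alternate an isotonic link fit with a thresholded residual update (sketch)."""
    n, p = len(X), len(X[0])
    ybar = sum(y) / n
    # Assumed initialization: thresholded marginal covariances between X and y.
    beta = [sum(X[i][j] * (y[i] - ybar) for i in range(n)) / n for j in range(p)]
    beta = normalize(hard_threshold(beta, s))
    for _ in range(n_iter):
        g = isotonic_link(X, y, beta)  # link step
        # Coefficient step: move along the average of residual-weighted covariates.
        grad = [sum(X[i][j] * (y[i] - g[i]) for i in range(n)) / n for j in range(p)]
        beta = normalize(hard_threshold([b + step * d for b, d in zip(beta, grad)], s))
    return beta, isotonic_link(X, y, beta)


def simulate(n=300, p=10, seed=0):
    """Toy data: Gaussian design, 2-sparse unit-norm coefficients, tanh link."""
    rng = random.Random(seed)
    beta_true = [0.6, 0.8] + [0.0] * (p - 2)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [math.tanh(2 * sum(a * b for a, b in zip(x, beta_true))) + 0.1 * rng.gauss(0, 1)
         for x in X]
    return X, y, beta_true
```

Calling `fit_sparse_monotone_sim(*simulate()[:2], s=2)` returns the estimated coefficient vector and the fitted link values at the observed indices. The unit-norm constraint is there because the scale of the coefficients cannot be separated from the unknown link; the sparsity level s is treated as known here purely for simplicity.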