pGMM Kernel Regression and Comparisons with Boosted Trees
In this work, we demonstrate the advantage of the pGMM (“powered generalized min-max”) kernel in the context of (ridge) regression. In recent prior studies, the pGMM kernel has been extensively evaluated for classification tasks with logistic regression, support vector machines, and deep neural networks. In this paper, we provide an experimental study of ridge regression, comparing pGMM kernel ridge regression with ordinary linear ridge regression and RBF kernel ridge regression. Perhaps surprisingly, even without tuning (i.e., fixing the power parameter of the pGMM kernel at p = 1), the pGMM kernel already performs well. Furthermore, by tuning the parameter p, this (deceptively simple) pGMM kernel performs quite comparably to boosted trees. Boosting and boosted trees are very popular in machine learning practice. For regression tasks, practitioners typically use L_2 boosting, i.e., minimizing the L_2 loss; for robustness, L_1 boosting is sometimes chosen instead. In this study, we implement L_p boosting for p ≥ 1 and include it in the “Fast ABC-Boost” package. Perhaps also surprisingly, the best performance (in terms of the L_2 regression loss) is often attained at p > 2, in some cases at p ≫ 2. This phenomenon was already demonstrated by Li et al. (UAI 2010) in the context of k-nearest neighbor classification using L_p distances. In summary, the implementation of L_p boosting provides practitioners with additional flexibility to tune boosting algorithms for potentially better accuracy in regression applications.
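To make the setup concrete, the following is a minimal Python sketch of pGMM kernel ridge regression. It assumes the commonly used form of the kernel: each input vector is first mapped to a nonnegative representation u = [max(x, 0), max(-x, 0)], and the kernel value is sum_i min(u_i, v_i)^p / sum_i max(u_i, v_i)^p, which reduces to the GMM kernel at p = 1. The exact kernel definition and the tuning protocol should be taken from the paper; the function names, the regularization parameter lam, and the small epsilon guard below are illustrative assumptions, not the authors’ code.

```python
# Hypothetical sketch of pGMM kernel ridge regression (not the paper's implementation).
import numpy as np

def nonneg_transform(X):
    """Map each row x to u = [max(x, 0), max(-x, 0)] so all entries are nonnegative."""
    return np.hstack([np.maximum(X, 0.0), np.maximum(-X, 0.0)])

def pgmm_gram(U, V, p=1.0, eps=1e-12):
    """Pairwise pGMM kernel matrix between rows of U and rows of V (assumed form)."""
    K = np.empty((U.shape[0], V.shape[0]))
    for i, u in enumerate(U):
        mins = np.power(np.minimum(u, V), p)   # broadcast u against every row of V
        maxs = np.power(np.maximum(u, V), p)
        K[i] = mins.sum(axis=1) / (maxs.sum(axis=1) + eps)
    return K

def fit_kernel_ridge(X_train, y_train, p=1.0, lam=1.0):
    """Closed-form kernel ridge regression: alpha = (K + lam*I)^{-1} y."""
    U = nonneg_transform(X_train)
    K = pgmm_gram(U, U, p)
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train)
    return U, alpha

def predict(X_test, U_train, alpha, p=1.0):
    """Predict with the fitted dual coefficients: f(x) = K(x, X_train) @ alpha."""
    K_test = pgmm_gram(nonneg_transform(X_test), U_train, p)
    return K_test @ alpha
```

For example, `U, alpha = fit_kernel_ridge(X_train, y_train, p=1.0, lam=1.0)` followed by `predict(X_test, U, alpha, p=1.0)` corresponds to the untuned p = 1 setting mentioned above; in practice p and lam would be selected by cross-validation.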
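The L_p boosting idea can likewise be sketched generically: at each round, a regression tree is fit to the negative gradient of the per-sample loss |y - F|^p, which is p · |y - F|^(p-1) · sign(y - F). The sketch below uses scikit-learn trees purely for illustration; it is not the Fast ABC-Boost implementation, and the hyperparameters shown (number of rounds, shrinkage, tree depth) are placeholders.

```python
# Hypothetical sketch of L_p gradient boosting for regression (p >= 1).
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def lp_negative_gradient(y, F, p):
    # Negative gradient of |y - F|^p with respect to F: p * |y - F|^(p-1) * sign(y - F).
    # For p = 2 this is 2*(y - F); for p = 1 it is sign(y - F).
    r = y - F
    return p * np.abs(r) ** (p - 1) * np.sign(r)

def lp_boost_fit(X, y, p=2.0, n_rounds=100, shrinkage=0.1, max_depth=3):
    init = float(np.mean(y))                 # constant initial model
    F = np.full(len(y), init)
    trees = []
    for _ in range(n_rounds):
        # Fit a regression tree to the pseudo-residuals of the L_p loss.
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, lp_negative_gradient(y, F, p))
        F += shrinkage * tree.predict(X)
        trees.append(tree)
    return init, trees

def lp_boost_predict(X, init, trees, shrinkage=0.1):
    return init + shrinkage * sum(tree.predict(X) for tree in trees)
```

With p = 2 this recovers standard least-squares gradient boosting and p = 1 gives the robust L_1 variant, while p > 2 places more weight on large residuals; as noted in the abstract, models trained with p > 2 can still achieve the best test performance measured by the L_2 regression loss.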