Model family selection for classification using Neural Decision Trees
Model selection consists in comparing several candidate models according to a metric to be optimized. The process often involves a grid search, or such, and cross-validation, which can be time consuming, as well as not providing much information about the dataset itself. In this paper we propose a method to reduce the scope of exploration needed for the task. The idea is to quantify how much it would be necessary to depart from trained instances of a given family, reference models (RMs) carrying `rigid' decision boundaries (e.g. decision trees), so as to obtain an equivalent or better model. In our approach, this is realized by progressively relaxing the decision boundaries of the initial decision trees (the RMs) as long as this is beneficial in terms of performance measured on an analyzed dataset. More specifically, this relaxation is performed by making use of a neural decision tree, which is a neural network built from DTs. The final model produced by our method carries non-linear decision boundaries. Measuring the performance of the final model, and its agreement to its seeding RM can help the user to figure out on which family of models he should focus on.
READ FULL TEXT