Predicting school transition rates in Austria with classification trees
Methods based on machine learning are becoming increasingly popular in many areas because they allow models to be fitted in a highly data-driven fashion and often show comparable or even better performance than classical methods. However, in the educational sciences the application of machine learning is still quite uncommon. This work investigates the benefit of using classification trees for analyzing data from the educational sciences. An application to data on school transition rates in Austria highlights several aspects of interest in this context: (i) the trees select variables for predicting school transition rates in a data-driven fashion, and the selected variables accord well with existing confirmatory theories from the educational sciences; (ii) trees can be employed for variable selection for regression models; and (iii) the classification performance of trees is comparable to that of binary regression models. These results indicate that trees, and possibly other machine learning methods, may also be helpful for exploring high-dimensional educational data sets, especially where no confirmatory theories have been developed yet.
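To illustrate what data-driven variable selection by a tree can look like, the following minimal sketch searches for the single binary split that most reduces Gini impurity, which is one common splitting criterion for classification trees and not necessarily the one used in this work. The records, the predictor names `parental_education` and `class_size`, and the outcome coding are made up for illustration; they are assumptions and are not taken from the Austrian data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Return (impurity_decrease, feature, threshold) for the best binary split.

    Each row is a dict mapping a feature name to a numeric value.
    """
    parent = gini(labels)
    n = len(labels)
    best = (0.0, None, None)
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            # Weighted average impurity of the two child nodes.
            child = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - child
            if gain > best[0]:
                best = (gain, feature, threshold)
    return best


# Toy, made-up records (NOT the Austrian data): two hypothetical predictors.
rows = [
    {"parental_education": 1, "class_size": 22},
    {"parental_education": 1, "class_size": 25},
    {"parental_education": 2, "class_size": 21},
    {"parental_education": 3, "class_size": 24},
    {"parental_education": 3, "class_size": 20},
    {"parental_education": 4, "class_size": 23},
]
# Hypothetical outcome coding: 1 = transition to the academic track, 0 = otherwise.
labels = [0, 0, 0, 1, 1, 1]

gain, feature, threshold = best_split(rows, labels)
print(feature, threshold, round(gain, 3))
```

On this toy data the search picks `parental_education` and ignores `class_size`, because only the former separates the two classes. A full tree would apply the same search recursively within each child node, and the variables chosen along the way could then serve as candidate predictors for a regression model.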