Scalable Model Selection for Staged Trees: Mean-posterior Clustering and Binary Trees
Several structure-learning algorithms have been proposed for staged trees, which are asymmetric extensions of Bayesian networks. However, these algorithms either scale poorly as the number of variables increases, restrict the set of models a priori, or fail to find models comparable to those found by existing methods. Here, we define an alternative algorithm based on a totally ordered hyperstage. We demonstrate how it can be used to obtain a quadratically scaling structure-learning algorithm for staged trees that restricts the model space a posteriori. Through comparative analysis, we show that the ordering provided by the mean posterior distributions allows us to outperform existing methods in both computational time and model score. By expanding the model space, this method also enables us to learn more complex relationships than existing model selection techniques, and we illustrate how this can enrich inferences in a real study.
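As a rough illustration of the idea of ordering by mean posterior, the following Python sketch handles a single binary variable. It is not the algorithm of the paper: it assumes each situation is summarised by a pair of outcome counts, that every stage receives the same Beta(a, b) prior, and that only stages adjacent in the posterior-mean ordering may be merged, greedily, while the log marginal likelihood improves. The function names `log_marginal`, `posterior_mean`, and `cluster_by_posterior_mean` are hypothetical.

```python
from math import lgamma


def log_marginal(successes, failures, a=1.0, b=1.0):
    """Log marginal likelihood of binary counts under a Beta(a, b) prior."""
    return (lgamma(a + b) - lgamma(a) - lgamma(b)
            + lgamma(a + successes) + lgamma(b + failures)
            - lgamma(a + b + successes + failures))


def posterior_mean(successes, failures, a=1.0, b=1.0):
    """Posterior mean of the success probability under a Beta(a, b) prior."""
    return (a + successes) / (a + b + successes + failures)


def cluster_by_posterior_mean(counts, a=1.0, b=1.0):
    """Greedily merge situations that are adjacent in posterior-mean order.

    counts: list of (successes, failures), one pair per situation.
    Returns a list of stages, each a list of situation indices.
    Assumption: the same Beta(a, b) prior is reused for merged stages.
    """
    # Total order on situations induced by their posterior means.
    order = sorted(range(len(counts)),
                   key=lambda i: posterior_mean(*counts[i], a, b))
    stages = [([i], counts[i][0], counts[i][1]) for i in order]

    while len(stages) > 1:
        best_gain, best_k = 0.0, None
        # Only neighbouring stages in the ordering are candidate merges.
        for k in range(len(stages) - 1):
            _, s1, f1 = stages[k]
            _, s2, f2 = stages[k + 1]
            gain = (log_marginal(s1 + s2, f1 + f2, a, b)
                    - log_marginal(s1, f1, a, b)
                    - log_marginal(s2, f2, a, b))
            if gain > best_gain:
                best_gain, best_k = gain, k
        if best_k is None:  # no merge improves the score
            break
        m1, s1, f1 = stages[best_k]
        m2, s2, f2 = stages[best_k + 1]
        stages[best_k:best_k + 2] = [(m1 + m2, s1 + s2, f1 + f2)]

    return [members for members, _, _ in stages]


if __name__ == "__main__":
    # Four situations with (successes, failures) counts.
    print(cluster_by_posterior_mean([(9, 1), (1, 9), (8, 2), (2, 8)]))
```

Because only neighbouring stages are compared, each pass considers a linear number of merges and there are at most linearly many passes, so this sketch performs a quadratic number of score comparisons in the number of situations.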