Graph model selection by edge probability sequential inference
Graphs are widely used for describing systems made up of many interacting components and for understanding the structure of their interactions. Various statistical models exist, which describe this structure as the result of a combination of constraints and randomness. The challenge is then to automatically identify the best model, and the best set of parameters, for a given graph. To do so, most authors rely on the minimum description length paradigm, and apply it to graphs by considering the entropy of probability distributions defined on graph ensembles. In this paper, we introduce edge probability sequential inference, a new approach to model selection, which relies on probability distributions on edge ensembles. From a theoretical point of view, we show that this methodology provides a more consistent ground for statistical inference than existing techniques, because it relies on multiple realizations of the random variable. It also provides better guarantees against overfitting, by making it possible to lower the number of parameters of the model below the number of observations. Experimentally, we illustrate the benefits of this methodology in two situations: inferring the partition of a stochastic blockmodel, and identifying the most relevant model for a given graph between the stochastic blockmodel and the configuration model.