On the Parameterized Complexity of Polytree Learning
A Bayesian network is a directed acyclic graph that represents statistical dependencies between variables of a joint probability distribution. A fundamental task in data science is to learn a Bayesian network from observed data. Polytree Learning is the problem of learning an optimal Bayesian network that fulfills the additional property that its underlying undirected graph is a forest. In this work, we revisit the complexity of Polytree Learning. We show that Polytree Learning can be solved in 3^n · |I|^𝒪(1) time where n is the number of variables and |I| is the total instance size. Moreover, we consider the influence of the number of variables d that might receive a nonempty parent set in the final DAG on the complexity of Polytree Learning. We show that Polytree Learning has no f(d)· |I|^𝒪(1)-time algorithm, unlike Bayesian network learning which can be solved in 2^d · |I|^𝒪(1) time. We show that, in contrast, if d and the maximum parent set size are bounded, then we can obtain efficient algorithms.
READ FULL TEXT