Approximate Identification of the Optimal Epidemic Source in Complex Networks
We consider the problem of identifying the source of an epidemic, spreading through a network, from a complete observation of the infected nodes in a snapshot of the network. Previous work on the problem has often employed geometric, spectral or heuristic approaches to identify the source, with the trees being the most studied network topology. We take a fully statistical approach and derive novel recursions to compute the Bayes optimal solution, under a susceptible-infected (SI) epidemic model. Our analysis is time and rate independent, and holds for general network topologies. We then provide two tractable algorithms for solving these recursions, a mean-field approximation and a greedy approach, and evaluate their performance on real and synthetic networks. Real networks are far from tree-like and an emphasis will be given to networks with high transitivity, such as social networks and those with communities. We show that on such networks, our approaches significantly outperform geometric and spectral centrality measures, most of which perform no better than random guessing. Both the greedy and mean-field approximation are scalable to large sparse networks.
READ FULL TEXT