Fast computation of distance-generalized cores using sampling
Core decomposition is a classic technique for discovering densely connected regions in a graph, with a wide range of applications. Formally, a k-core is a maximal subgraph in which each vertex has at least k neighbors. A natural extension of a k-core is a (k, h)-core, in which each node must have at least k nodes that can be reached by a path of length at most h. The downside of (k, h)-core decomposition is a significant increase in computational complexity: whereas the standard core decomposition can be done in O(m) time, the generalization can require O(n^2m) time, where n and m are the number of nodes and edges in the given graph. In this paper we propose a randomized algorithm that produces an ϵ-approximation of the (k, h)-core decomposition with probability 1 - δ in O(ϵ^-2 hm (log^2 n - log δ)) time. The approximation is based on sampling the neighborhoods of nodes, and we use the Chernoff bound to prove the approximation guarantee. We demonstrate empirically that approximating the decomposition complements the exact computation: on the networks where computing the exact solution is slow, computing the approximation is significantly faster.
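To make the definition concrete, the following is a minimal sketch of a naive exact (k, h)-core decomposition by peeling. It is not the randomized algorithm proposed in the paper; it is a deliberately simple baseline that illustrates why the exact computation is expensive. The function names `h_neighborhood` and `kh_core_numbers` are hypothetical, the graph is assumed to be an undirected adjacency dictionary, and distances are assumed to be measured inside the subgraph that remains after peeling.

```python
from collections import deque


def h_neighborhood(adj, alive, source, h):
    """Return the alive nodes within distance h of source, excluding source.

    Distances are measured in the subgraph induced by `alive` (an assumption
    of this sketch).
    """
    seen = {source}
    frontier = deque([(source, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == h:
            continue  # do not expand beyond distance h
        for nb in adj[node]:
            if nb in alive and nb not in seen:
                seen.add(nb)
                frontier.append((nb, dist + 1))
    seen.discard(source)
    return seen


def kh_core_numbers(adj, h):
    """Naive exact (k, h)-core numbers via peeling.

    Repeatedly removes a node with the smallest h-neighborhood. All
    h-neighborhoods are recomputed from scratch after every removal, which
    keeps the code short but makes it slow on large graphs.
    """
    alive = set(adj)
    core = {}
    k = 0
    while alive:
        deg = {v: len(h_neighborhood(adj, alive, v, h)) for v in alive}
        v = min(alive, key=deg.get)
        k = max(k, deg[v])  # core numbers never decrease during peeling
        core[v] = k
        alive.remove(v)
    return core


# Hypothetical toy graph: a triangle 0-1-2 with a pendant node 3 attached to 2.
toy = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
print(kh_core_numbers(toy, 1))  # h = 1 reduces to the standard core decomposition
print(kh_core_numbers(toy, 2))
```

With h = 1 the sketch reduces to the standard core decomposition. For larger h every removal can change the h-neighborhoods of many nodes, and these repeated breadth-first searches are the cost that a sampling-based approximation aims to avoid.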