Scaling Up Maximal k-plex Enumeration
Finding all maximal k-plexes on networks is a fundamental research problem in graph analysis due to many important applications, such as community detection, biological graph analysis, and so on. A k-plex is a subgraph in which every vertex is adjacent to all but at most k vertices within the subgraph. In this paper, we study the problem of enumerating all large maximal k-plexes of a graph and develop several new and efficient techniques to solve the problem. Specifically, we first propose several novel upper-bounding techniques to prune unnecessary computations during the enumeration procedure. We show that the proposed upper bounds can be computed in linear time. Then, we develop a new branch-and-bound algorithm with a carefully-designed pivot re-selection strategy to enumerate all k-plexes, which outputs all k-plexes in O(n^2γ_k^n) time theoretically, where n is the number of vertices of the graph and γ_k is strictly smaller than 2. In addition, a parallel version of the proposed algorithm is further developed to scale up to process large real-world graphs. Finally, extensive experimental results show that the proposed sequential algorithm can achieve up to 2× to 100× speedup over the state-of-the-art sequential algorithms on most benchmark graphs. The results also demonstrate the high scalability of the proposed parallel algorithm. For example, on a large real-world graph with more than 200 million edges, our parallel algorithm can finish the computation within two minutes, while the state-of-the-art parallel algorithm cannot terminate within 24 hours.
READ FULL TEXT