Mining Maximal Induced Bicliques using Odd Cycle Transversals
Many common graph data mining tasks take the form of identifying dense subgraphs (e.g. clustering, clique-finding, etc). In biological applications, the natural model for these dense substructures is often a complete bipartite graph (biclique), and the problem requires enumerating all maximal bicliques (instead of just identifying the largest or densest). The best known algorithm in general graphs is due to Dias et al., and runs in time O(M |V|^4 ), where M is the number of maximal induced bicliques (MIBs) in the graph. When the graph being searched is itself bipartite, Zhang et al. give a faster algorithm where the time per MIB depends on the number of edges in the graph. In this work, we present a new algorithm for enumerating MIBs in general graphs, whose run time depends on how "close to bipartite" the input is. Specifically, the runtime is parameterized by the size k of an odd cycle transversal (OCT), a vertex set whose deletion results in a bipartite graph. Our algorithm runs in time O(M |V||E|k^2 3^(k/3) ), which is an improvement on Dias et al. whenever k <= 3log_3(|V|). We implement our algorithm alongside a variant of Dias et al.'s in open-source C++ code, and experimentally verify that the OCT-based approach is faster in practice on graphs with a wide variety of sizes, densities, and OCT decompositions.
READ FULL TEXT