Sparse dominating sets and balanced neighborhood partitioning

12/21/2021
by   Yosuke Mizutani, et al.
0

Recent work in metagenomics constructs a partition of the assembly graph using an r-dominating set to enable scalable data representation and rapid approximate queries. In this paper, we consider two problems that arise in this setting: selection of a dominating set that minimizes uncertainty in partitioning, and reducing the amount of variation in piece sizes to improve scalability. First, we introduce a notion of "sparse" dominating sets which minimize the number of vertices with multiple closest dominators as measured using a new "congestion" parameter. Although identifying the least congested dominating set is NP-hard, we present an algorithm that finds one with approximately minimum congestion. In the second setting, we consider the problem of "balanced neighborhood partitioning": given an r-dominating set, find the partition which assigns each vertex to one of its closest dominators and achieves the "most balanced" piece sizes. We consider the variant which minimizes the variance of piece sizes, and show that it is NP-hard iff r is greater than 1. We design and analyze several algorithms, including a polynomial-time approach which is exact when r=1 (and heuristic otherwise). We complement our theoretical results with extensive computational experiments on a corpus of real-world networks showing that sparse dominating sets lead to more balanced neighborhood partitionings. Further, on the metagenome HuSB1, our approach maintains high neighborhood query containment and similarity while improving piece size variance.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset