Optimal LP Rounding and Fast Combinatorial Algorithms for Clustering Edge-Colored Hypergraphs
We study the approximability of a recently introduced framework for clustering edge-colored hypergraphs, where goal is to color nodes in a way that minimizes the number of hyperedges containing a node with a different color than the hyperedge. This problem is closely related to chromatic correlation clustering and various generalized multiway cut problems. We first of all provide a min{2 - 2/k, 2-2/(r+1)}-approximation by rounding a natural linear programming relaxation, where r is the maximum hyperedge size and k is the number of colors. This improves on the best previous rounding scheme that achieves an approximation of min{2 - 1/k, 2-1/(r+1)}. We show our rounding scheme is optimal by proving a matching integrality gap. When r is large, our approximation matches a known 2(1-1/k)-approximation based on reducing to node-weighted multiway cut and rounding a different linear program. The exact relationship between the two linear programs was not previously known; we prove that the canonical relaxation is always at least as tight as the node-weighted multiway cut relaxation, and can be strictly tighter. We also show that when r and k are arbitrary, the edge-colored clustering objective is approximation equivalent to vertex cover. Explicitly reducing to a graph and applying existing vertex cover algorithms leads to runtimes that are worse than linear in terms of the hypergraph size. We improve on this by designing a linear-time 2-approximation algorithm that implicitly approximates the underlying vertex cover problem without explicitly iterating through all edges in the reduced graph.
READ FULL TEXT