Local Correlation Clustering with Asymmetric Classification Errors
In the Correlation Clustering problem, we are given a complete weighted graph G whose edges are labeled "similar" or "dissimilar" by a noisy binary classifier. For a clustering 𝒞 of G, a similar edge is in disagreement with 𝒞 if its endpoints belong to distinct clusters, and a dissimilar edge is in disagreement with 𝒞 if its endpoints belong to the same cluster. The disagreements vector, dis, is indexed by the vertices of G, and its v-th coordinate dis_v equals the total weight of all disagreeing edges incident on v. The goal is to produce a clustering that minimizes the ℓ_p norm of the disagreements vector for p ≥ 1. We study the ℓ_p objective in Correlation Clustering under the following assumption: every similar edge has weight in the range [α𝐰, 𝐰] and every dissimilar edge has weight at least α𝐰 (where α ≤ 1 and 𝐰 > 0 is a scaling parameter). We give an O((1/α)^(1/2 - 1/(2p)) · log(1/α))-approximation algorithm for this problem. Furthermore, we show an almost matching convex programming integrality gap.
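The objective defined above can be made concrete with a small sketch. The Python below only evaluates the disagreements vector and its ℓ_p norm for a given clustering; it is not the approximation algorithm from the paper. The function names, the edge representation, and the four-vertex instance (with 𝐰 = 1 and α = 0.5) are illustrative assumptions.

```python
def disagreements(n, edges, cluster_of):
    """Return the disagreements vector for a clustering.

    edges: dict mapping (u, v) to (label, weight), where label is
           "similar" or "dissimilar" (hypothetical representation).
    cluster_of: list where cluster_of[v] is the cluster id of vertex v.
    """
    dis = [0.0] * n
    for (u, v), (label, weight) in edges.items():
        same = cluster_of[u] == cluster_of[v]
        # A similar edge disagrees if it is cut; a dissimilar edge
        # disagrees if both endpoints share a cluster.
        if (label == "similar" and not same) or (label == "dissimilar" and same):
            dis[u] += weight
            dis[v] += weight
    return dis


def lp_norm(vec, p):
    """l_p norm of a non-negative vector, for p >= 1 or p = infinity."""
    if p == float("inf"):
        return max(vec)
    return sum(x ** p for x in vec) ** (1.0 / p)


# Assumed toy instance: similar weights lie in [0.5, 1], dissimilar weights >= 0.5.
edges = {
    (0, 1): ("similar", 1.0),
    (0, 2): ("similar", 0.5),
    (0, 3): ("dissimilar", 2.0),
    (1, 2): ("dissimilar", 1.0),
    (1, 3): ("dissimilar", 0.5),
    (2, 3): ("similar", 1.0),
}
cluster_of = [0, 0, 0, 1]  # clusters {0, 1, 2} and {3}

dis = disagreements(4, edges, cluster_of)
print(dis)                         # [0.0, 1.0, 2.0, 1.0]
print(lp_norm(dis, 1))             # 4.0
print(lp_norm(dis, float("inf")))  # 2.0
```

In this instance the disagreeing edges are (1, 2) and (2, 3). The ℓ_1 norm is twice the total weight of disagreeing edges, since each such edge is counted at both endpoints, while the ℓ_∞ norm is the largest disagreement weight at any single vertex.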