Maximum Coverage in the Data Stream Model: Parameterized and Generalized
We present algorithms for the Max-Cover and Max-Unique-Cover problems in the data stream model. The input to both problems are m subsets of a universe of size n and a value k∈ [m]. In Max-Cover, the problem is to find a collection of at most k sets such that the number of elements covered by at least one set is maximized. In Max-Unique-Cover, the problem is to find a collection of at most k sets such that the number of elements covered by exactly one set is maximized. Our goal is to design single-pass algorithms that use space that is sublinear in the input size. Our main algorithmic results are: If the sets have size at most d, there exist single-pass algorithms using Õ(d^d+1 k^d) space that solve both problems exactly. This is optimal up to polylogarithmic factors for constant d. If each element appears in at most r sets, we present single pass algorithms using Õ(k^2 r/ϵ^3) space that return a 1+ϵ approximation in the case of Max-Cover. We also present a single-pass algorithm using slightly more memory, i.e., Õ(k^3 r/ϵ^4) space, that 1+ϵ approximates Max-Unique-Cover. In contrast to the above results, when d and r are arbitrary, any constant pass 1+ϵ approximation algorithm for either problem requires Ω(ϵ^-2m) space but a single pass O(ϵ^-2mk) space algorithm exists. In fact any constant-pass algorithm with an approximation better than e/(e-1) and e^1-1/k for Max-Cover and Max-Unique-Cover respectively requires Ω(m/k^2) space when d and r are unrestricted. En route, we also obtain an algorithm for a parameterized version of the streaming Set-Cover problem.
READ FULL TEXT