Differentially Private Continual Releases of Streaming Frequency Moment Estimations
The streaming model of computation is a popular approach for working with large-scale data. In this setting, there is a stream of items and the goal is to compute the desired quantities (usually data statistics) while making a single pass through the stream and using as little space as possible. Motivated by the importance of data privacy, we develop differentially private streaming algorithms under the continual release setting, where the union of outputs of the algorithm at every timestamp must be differentially private. Specifically, we study the fundamental ℓ_p (p∈ [0,+∞)) frequency moment estimation problem under this setting, and give an ε-DP algorithm that achieves (1+η)-relative approximation (∀η∈(0,1)) with polylog(Tn) additive error and uses polylog(Tn)·max(1, n^1-2/p) space, where T is the length of the stream and n is the size of the universe of elements. Our space is near optimal up to poly-logarithmic factors even in the non-private setting. To obtain our results, we first reduce several primitives under the differentially private continual release model, such as counting distinct elements, heavy hitters and counting low frequency elements, to the simpler, counting/summing problems in the same setting. Based on these primitives, we develop a differentially private continual release level set estimation approach to address the ℓ_p frequency moment estimation problem. We also provide a simple extension of our results to the harder sliding window model, where the statistics must be maintained over the past W data items.
READ FULL TEXT