A Formal Analysis of the Count-Min Sketch with Conservative Updates
Count-Min Sketch with Conservative Updates (CMS-CU) is a popular algorithm to approximately count items' appearances in a data stream. Despite CMS-CU's widespread adoption, the theoretical analysis of its performance is still wanting because of its inherent difficulty. In this paper, we propose a novel approach to study CMS-CU and derive new upper bounds on the expected value and the CCDF of the estimation error under an i.i.d. request process. Our formulas can be successfully employed to derive improved estimates for the precision of heavy-hitter detection methods and improved configuration rules for CMS-CU. The bounds are evaluated both on synthetic and real traces.
READ FULL TEXT