Plato: Approximate Analytics over Compressed Time Series with Tight Deterministic Error Guarantees
Plato provides sound and tight deterministic error guarantees for approximate analytics over compressed time series. Plato supports expressions that compose linear algebra operators over vectors, which are commonly used in time series analytics, with arithmetic operators. Such expressions can capture common statistics, such as correlation and cross-correlation, that may combine multiple time series. The time series are segmented by either fixed-length segmentation or the more effective variable-length segmentation. Each segment (i) is compressed by an estimation function that approximates the actual values and comes from a user-chosen estimation function family, and (ii) is associated with one to three precomputed error measures, depending on the case. From these, Plato provides tight deterministic error guarantees for analytics over the compressed time series. This work identifies two broad estimation function family groups. The Vector Space (VS) family and the Linear Scalable Family (LSF) defined in this work lead to theoretically and practically high-quality guarantees, even for queries that combine multiple time series that have been compressed independently. Well-known function families (e.g., the polynomial function family) belong to LSF. The theoretical aspect of "high quality" is captured by the Amplitude Independence (AI) property: an AI guarantee does not depend on the amplitude of the involved time series, even when multiple time series are combined. Experiments on four real-life datasets validated the importance of AI error guarantees: when the novel AI guarantees were applicable, they could ensure that the approximate query results were very close (typically 1
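To make the segment-level idea concrete, the following is a minimal sketch rather than Plato's actual method or API. It assumes fixed-length segmentation, a hypothetical estimation function (the straight line through each segment's first and last points), and a single precomputed error measure per segment (the L1 distance between the actual and estimated values). The names `compress` and `approx_sum` are illustrative. Under these assumptions, the triangle inequality gives a deterministic bound on the error of an approximate sum query; the error measures and guarantees defined in the paper itself may differ.

```python
def compress(series, seg_len):
    """Split `series` into fixed-length segments (illustrative, not Plato's code).

    Each segment is stored as (intercept, slope, length, l1_error), where the
    line through the segment's endpoints is the assumed estimation function and
    l1_error is the precomputed sum of absolute residuals.
    """
    segments = []
    for start in range(0, len(series), seg_len):
        chunk = series[start:start + seg_len]
        n = len(chunk)
        intercept = chunk[0]
        slope = (chunk[-1] - chunk[0]) / (n - 1) if n > 1 else 0.0
        l1_error = sum(abs(v - (intercept + slope * i)) for i, v in enumerate(chunk))
        segments.append((intercept, slope, n, l1_error))
    return segments


def approx_sum(segments):
    """Answer a sum query from the compressed form only.

    Returns (estimate, bound) where |true_sum - estimate| <= bound, because the
    error of a sum is at most the sum of the absolute per-point errors.
    """
    estimate = 0.0
    bound = 0.0
    for intercept, slope, n, l1_error in segments:
        # Closed-form sum of intercept + slope * i for i = 0 .. n-1.
        estimate += n * intercept + slope * n * (n - 1) / 2
        bound += l1_error
    return estimate, bound


series = [1.0, 2.0, 4.0, 3.0, 5.0, 9.0, 8.0, 7.0, 6.0, 10.0]
estimate, bound = approx_sum(compress(series, seg_len=4))
print(f"estimate={estimate:.3f}, guaranteed error <= {bound:.3f}")
```

The query is answered without touching the raw values: only the per-segment function parameters and the stored error measure are read. This bound is sound for the sum query but is only one simple instance of the idea; queries that combine several time series would need the additional error measures the abstract mentions.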