Coupling Online-Offline Learning for Multi-distributional Data Streams

02/12/2022
by   Zhilin Zhao, et al.
0

The distributions of real-life data streams are usually nonstationary, where one exciting setting is that a stream can be decomposed into several offline intervals with a fixed time horizon but different distributions and an out-of-distribution online interval. We call such data multi-distributional data streams, on which learning an on-the-fly expert for unseen samples with a desirable generalization is demanding yet highly challenging owing to the multi-distributional streaming nature, particularly when initially limited data is available for the online interval. To address these challenges, this work introduces a novel optimization method named coupling online-offline learning (CO_2) with theoretical guarantees about the knowledge transfer, the regret, and the generalization error. CO_2 extracts knowledge by training an offline expert for each offline interval and update an online expert by an off-the-shelf online optimization method in the online interval. CO_2 outputs a hypothesis for each sample by adaptively coupling both the offline experts and the underlying online expert through an expert-tracking strategy to adapt to the dynamic environment. To study the generalization performance of the output hypothesis, we propose a general theory to analyze its excess risk bound related to the loss function properties, the hypothesis class, the data distribution, and the regret.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset