Searching for Anomalies over Composite Hypotheses
We consider the problem of detecting anomalies among multiple processes. We study the composite hypothesis case, in which the measurements drawn when observing a process follow a common distribution with an unknown parameter (vector), whose value lies in either a normal or an abnormal parameter space, depending on the state of the process. The objective is a sequential search strategy that minimizes the expected detection time subject to an error probability constraint. We develop a deterministic search algorithm with the following desirable properties. First, when no additional side information on the process states is known, the proposed algorithm is asymptotically optimal in terms of minimizing the detection delay as the error probability approaches zero. Second, when the parameter value under the null hypothesis is known and equal for all normal processes, the proposed algorithm is asymptotically optimal as well, with a better detection time determined by the true null state. Third, when the parameter value under the null hypothesis is unknown, but is known to be equal for all normal processes, the proposed algorithm is consistent, in the sense that its error probability decays to zero as the detection delay grows. Finally, an explicit upper bound on the error probability under the proposed algorithm is established for the finite sample regime. Extensive experiments on a synthetic dataset and on the DARPA intrusion detection dataset demonstrate strong performance of the proposed algorithm compared with existing methods.
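To make the setting concrete, the following is a minimal sketch of a sequential search with a composite alternative. It is not the algorithm proposed in the paper. It assumes unit-variance Gaussian measurements, a known normal parameter of zero, and an unknown nonzero mean for the anomalous process, and it probes the processes in a plain round-robin order. The names `glr_statistic` and `round_robin_search`, and the stopping threshold `-log(alpha)`, are illustrative choices and carry no optimality or error guarantee.

```python
import math
from typing import Callable, List, Tuple


def glr_statistic(samples: List[float]) -> float:
    """Generalized log-likelihood ratio for N(theta, 1) against N(0, 1).

    Maximizing over the unknown mean theta gives n * mean^2 / 2.
    (Assumption: unit-variance Gaussian model, normal state theta = 0.)
    """
    n = len(samples)
    if n == 0:
        return 0.0
    mean = sum(samples) / n
    return 0.5 * n * mean * mean


def round_robin_search(
    sample: Callable[[int], float],
    num_processes: int,
    alpha: float,
    max_samples: int = 10_000,
) -> Tuple[int, int]:
    """Probe processes in round-robin order until one looks anomalous.

    Returns (declared_index, samples_used); declared_index is -1 if the
    sampling budget runs out. The threshold -log(alpha) is an
    illustrative stopping rule, not the one analyzed in the paper.
    """
    threshold = -math.log(alpha)
    history: List[List[float]] = [[] for _ in range(num_processes)]
    for t in range(max_samples):
        m = t % num_processes          # deterministic probing order
        history[m].append(sample(m))   # one measurement from process m
        if glr_statistic(history[m]) >= threshold:
            return m, t + 1
    return -1, max_samples


if __name__ == "__main__":
    # Noise-free toy input: process 2 has mean 1, the others have mean 0.
    declared, used = round_robin_search(
        lambda m: 1.0 if m == 2 else 0.0, num_processes=3, alpha=0.01
    )
    print(declared, used)
```

A round-robin order spends equal effort on every process; a search strategy of the kind described in the abstract would instead choose which process to observe next, which is what drives the detection-time gains.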