Correlating Unlabeled Events at Runtime
Process mining is of great importance for both data-centric and process-centric systems. It takes as input so-called process logs, which are collections of partially ordered events. For mining approaches to work, each event must carry at least three attributes: a case ID, a task ID, and a timestamp. When the case ID is unknown, the event is called unlabeled. Traditionally, process mining is an offline task in which events collected from different sources are correlated, usually manually. That is, events belonging to the same process instance are assigned the same case ID. With the high volume and high velocity of data in, e.g., IoT applications, process mining is shifting to an online task. For this, event correlation has to be automated and has to occur as the data is generated. In this paper, we introduce an approach that correlates unlabeled events at runtime. Given a process model, a stream of unlabeled events, and additional information about task durations, our approach assigns a case identifier to each set of unlabeled events, together with a trust percentage. It can also check the conformance of the identified cases with the process model. A prototype of the proposed approach was implemented and evaluated against real-life and synthetic logs.
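To make the idea of runtime correlation concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes a hypothetical process model given as a successor relation (`SUCCESSORS`), hypothetical start tasks (`START_TASKS`), and hypothetical minimum and maximum task durations (`DURATION`). Each incoming unlabeled event is matched against the open cases that could plausibly have produced it, and the trust value is simply one divided by the number of such candidates, expressed as a fraction rather than a percentage.

```python
from dataclasses import dataclass, field

# Hypothetical process model: task -> tasks allowed to follow it directly.
SUCCESSORS = {"A": {"B"}, "B": {"C"}, "C": set()}
# Hypothetical set of tasks that may start a new case.
START_TASKS = {"A"}
# Hypothetical (min, max) time allowed between the previous event of a case
# and the event for the given task.
DURATION = {"A": (0, 0), "B": (1, 5), "C": (1, 5)}


@dataclass
class Case:
    case_id: int
    last_task: str
    last_time: float


@dataclass
class Correlator:
    cases: list = field(default_factory=list)

    def correlate(self, task, timestamp):
        """Return (case_id, trust) for one unlabeled event.

        Returns (None, 0.0) when no open case can explain the event,
        i.e. the event does not conform to the assumed model.
        """
        if task in START_TASKS:
            # A start task always opens a new case in this sketch.
            case = Case(len(self.cases) + 1, task, timestamp)
            self.cases.append(case)
            return case.case_id, 1.0

        lo, hi = DURATION[task]
        # Candidate cases: the model allows this task next, and the elapsed
        # time since the case's last event fits the duration bounds.
        candidates = [
            c for c in self.cases
            if task in SUCCESSORS[c.last_task]
            and lo <= timestamp - c.last_time <= hi
        ]
        if not candidates:
            return None, 0.0

        # Tie-break: pick the case that has been waiting longest.
        chosen = min(candidates, key=lambda c: c.last_time)
        chosen.last_task, chosen.last_time = task, timestamp
        return chosen.case_id, 1.0 / len(candidates)


if __name__ == "__main__":
    correlator = Correlator()
    stream = [("A", 0), ("A", 1), ("B", 3), ("B", 4), ("C", 5)]
    for task, ts in stream:
        print(task, ts, correlator.correlate(task, ts))
```

In this sketch, an event that two open cases could equally explain receives a trust of 0.5, while an event with a single candidate receives 1.0. A real correlator would need to handle concurrency, loops, and competing assignments, which this example deliberately leaves out.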