Contextually learnt detection of unusual motion-based behaviour in crowded public spaces
In this paper we analyze behaviour in crowded public places at the level of holistic motion. Our aim is to learn the scope of "normal behaviour" for a particular scene without user input, strong scene priors or labelled data, and thus to flag novelty in unseen footage. The first contribution is a low-level motion model based on what we term tracklet primitives, which are scene-specific elementary motions. We propose a clustering-based algorithm that estimates tracklets from local approximations to tracks of appearance features. We then describe two methods for inferring motion novelty from tracklet primitives: (a) an approach based on a non-hierarchical ensemble of Markov chains, which captures behavioural characteristics at different scales, and (b) a more flexible alternative which generalizes better by accounting for the constraints that intentionality and goal-oriented planning impose on human motion in a particular scene. Evaluated on a 2-hour video of a busy city marketplace, both algorithms successfully infer unusual behaviour, with the latter model performing better for novelties at a larger spatial scale.
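To make approach (a) concrete, the following is a minimal sketch of how novelty could be scored with Markov chains over discrete tracklet-primitive labels. It is not the paper's implementation: the class `MarkovChainNovelty`, the function `ensemble_novelty`, the additive smoothing, the use of temporal subsampling strides as a stand-in for "scales", and the max-over-chains combination rule are all assumptions made for illustration.

```python
import math
from collections import defaultdict


class MarkovChainNovelty:
    """First-order Markov chain over integer primitive labels (hypothetical helper)."""

    def __init__(self, n_primitives, smoothing=1.0):
        self.n = n_primitives          # number of distinct tracklet primitives
        self.smoothing = smoothing     # additive smoothing (an assumption)
        self.counts = defaultdict(lambda: defaultdict(float))

    def fit(self, sequences):
        # Count observed transitions between consecutive primitives.
        for seq in sequences:
            for a, b in zip(seq, seq[1:]):
                self.counts[a][b] += 1.0
        return self

    def transition_prob(self, a, b):
        # Smoothed estimate of P(b | a); unseen rows fall back to uniform.
        row = self.counts.get(a, {})
        total = sum(row.values())
        return (row.get(b, 0.0) + self.smoothing) / (total + self.smoothing * self.n)

    def novelty(self, seq):
        # Mean negative log-likelihood per transition; higher means more unusual.
        steps = list(zip(seq, seq[1:]))
        if not steps:
            return 0.0
        nll = -sum(math.log(self.transition_prob(a, b)) for a, b in steps)
        return nll / len(steps)


def ensemble_novelty(chains_by_stride, seq):
    # Flat (non-hierarchical) combination: each chain sees the sequence
    # subsampled at its own stride, and the most surprised chain wins.
    return max(chain.novelty(seq[::stride])
               for stride, chain in chains_by_stride.items())


# Toy data: "normal" motion always visits primitives 0, 1, 2, 3 in order.
normal = [[0, 1, 2, 3]] * 10
chains = {
    stride: MarkovChainNovelty(n_primitives=4).fit([s[::stride] for s in normal])
    for stride in (1, 2)
}
score_normal = ensemble_novelty(chains, [0, 1, 2, 3])
score_reversed = ensemble_novelty(chains, [3, 2, 1, 0])
```

In this toy setting the reversed sequence receives a higher score than the normal one, because its transitions were never observed during training. A threshold on the score, chosen on held-out normal footage, would then decide when to raise an alert.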