Robust Online Multi-target Visual Tracking using a HISP Filter with Discriminative Deep Appearance Learning
We propose a novel online multi-target visual tracker based on the recently developed Hypothesized and Independent Stochastic Population (HISP) filter. The HISP filter combines the advantages of traditional tracking approaches such as multiple hypothesis tracking (MHT) and point-process-based approaches such as the probability hypothesis density (PHD) filter, and has linear complexity while maintaining track identities. We apply this filter to track multiple targets in video sequences acquired under varying environmental conditions and target densities using a tracking-by-detection approach. We also adopt a deep convolutional neural network (CNN) appearance representation by training a verification-identification network (VerIdNet) on large-scale person re-identification data sets. We construct an augmented likelihood in a principled manner using these deep CNN appearance features and spatio-temporal (motion) information, which improves the tracker's performance. In addition, we solve the problem of two or more targets sharing an identical label by taking into account the weight propagated with each confirmed hypothesis. Finally, we carry out extensive experiments on the Multiple Object Tracking 2016 (MOT16) and 2017 (MOT17) benchmark data sets and find that our tracker significantly outperforms several state-of-the-art trackers in terms of tracking accuracy.
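To make the idea of an augmented likelihood concrete, the sketch below shows one common way such a fusion can be set up: a Gaussian motion likelihood multiplied by an appearance likelihood derived from the cosine similarity of deep embeddings. This is a minimal illustration under assumed independence of the two cues; the function names (motion_likelihood, appearance_likelihood, augmented_likelihood), the box parameterization, and the specific fusion form are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: fusing a motion cue and a deep appearance cue
# into a single augmented likelihood for detection-to-track association.
# Names and the multiplicative fusion are assumptions, not the paper's code.
import numpy as np

def motion_likelihood(pred_box, det_box, cov):
    """Gaussian likelihood of a detection given the predicted track state
    (state reduced to box centre and size for simplicity)."""
    diff = np.asarray(det_box, dtype=float) - np.asarray(pred_box, dtype=float)
    inv_cov = np.linalg.inv(cov)
    norm = np.sqrt((2.0 * np.pi) ** len(diff) * np.linalg.det(cov))
    return float(np.exp(-0.5 * diff @ inv_cov @ diff) / norm)

def appearance_likelihood(track_feat, det_feat):
    """Cosine similarity between L2-normalised deep appearance embeddings,
    mapped to the range (0, 1]."""
    t = track_feat / np.linalg.norm(track_feat)
    d = det_feat / np.linalg.norm(det_feat)
    return float(0.5 * (1.0 + t @ d))

def augmented_likelihood(pred_box, det_box, cov, track_feat, det_feat):
    """Fuse the two cues multiplicatively, assuming they are conditionally
    independent given the track."""
    return motion_likelihood(pred_box, det_box, cov) * \
           appearance_likelihood(track_feat, det_feat)

# Toy usage: one predicted track compared against one detection.
rng = np.random.default_rng(0)
cov = np.diag([25.0, 25.0, 16.0, 16.0])           # (cx, cy, w, h) noise
pred, det = [100, 60, 40, 80], [103, 58, 41, 79]  # boxes as (cx, cy, w, h)
feat_track, feat_det = rng.normal(size=128), rng.normal(size=128)
print(augmented_likelihood(pred, det, cov, feat_track, feat_det))
```

In this kind of scheme the appearance term helps disambiguate nearby targets with similar motion, which is the role the deep VerIdNet features play in the tracker described above.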