TVNet: Temporal Voting Network for Action Localization

01/02/2022
by   Hanyuan Wang, et al.

We propose a Temporal Voting Network (TVNet) for action localization in untrimmed videos. TVNet incorporates a novel Voting Evidence Module that locates temporal boundaries more accurately by accumulating temporal contextual evidence to predict frame-level probabilities of start and end action boundaries. This action-independent evidence module is incorporated within a pipeline that calculates confidence scores and action classes. We achieve an average mAP of 34.6 methods with the highest IoU of 0.95. TVNet also achieves mAP of 56.0 combined with PGCN and 59.1 prior work at all thresholds. Our code is available at https://github.com/hanielwang/TVNet.
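
The results above are reported at temporal IoU thresholds, so it may help to recall how temporal IoU between a predicted segment and a ground-truth segment is usually computed. The following is a minimal sketch of that standard definition, not code from the TVNet repository; the function name `temporal_iou` and the (start, end) tuple representation are assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two segments given as (start, end), e.g. in seconds.

    Illustrative helper (hypothetical name), not part of TVNet.
    """
    (pred_start, pred_end), (gt_start, gt_end) = pred, gt
    # Length of the overlap; zero when the segments are disjoint.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    # Union = sum of both lengths minus the overlap counted twice.
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# A prediction counts as correct at threshold t only if its IoU is >= t,
# so a threshold of 0.95 demands very precise start and end boundaries.
print(temporal_iou((0.0, 10.0), (5.0, 15.0)))
print(temporal_iou((2.0, 12.0), (2.2, 12.1)) >= 0.95)
```

Under this definition, a boundary error of a fraction of a second on a ten-second action can already decide whether a detection passes the 0.95 threshold, which is why accurate boundary localization matters most at high IoU.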
