Egocentric Hand-object Interaction Detection and Application

09/29/2021
by Yao Lu, et al.

In this paper, we present a method to detect hand-object interaction from an egocentric perspective. In contrast to massive data-driven discriminator-based methods such as <cit.>, we propose a novel workflow that utilises cues from both the hand and the object. Specifically, we train networks to predict the hand pose, the hand mask and the in-hand object mask, and fuse these predictions to jointly infer the hand-object interaction status. We compare our method with the most recent work from Shan et al. <cit.> on selected images from the EPIC-KITCHENS <cit.> dataset and achieve 89% accuracy on HOI (hand-object interaction) detection, which is comparable to Shan's (92%). However, on the same machine our method runs at over 30 FPS, far more efficient than Shan's (1∼2 FPS). Furthermore, our approach enables the segmentation of script-less activities using the frames extracted via HOI status detection. We achieve 68.2% and 82.8% F1 score on the GTEA <cit.> and UTGrasp <cit.> datasets respectively, both comparable to the SOTA methods.
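The fusion of the three cues can be illustrated with a minimal sketch. The function below, `detect_hoi`, its overlap-plus-keypoint decision rule, and the `contact_thresh` parameter are illustrative assumptions, not the paper's exact fusion logic; it only shows one plausible way to combine a hand mask, an in-hand object mask and 2D hand keypoints into a binary interaction status.

```python
import numpy as np

def detect_hoi(hand_mask, obj_mask, hand_pose, contact_thresh=0.1):
    """Fuse per-frame cues into a binary HOI status.

    hand_mask, obj_mask: boolean HxW arrays from the segmentation networks.
    hand_pose: (K, 2) array of 2D keypoints (x, y) from the pose network.
    NOTE: the decision rule and threshold are assumptions for illustration.
    """
    if not obj_mask.any():
        return False  # no in-hand object predicted -> no interaction
    # Fraction of the predicted object mask overlapping the hand mask
    overlap = np.logical_and(hand_mask, obj_mask).sum() / obj_mask.sum()
    # Require at least one hand keypoint to fall on the object mask
    xs = np.clip(hand_pose[:, 0].astype(int), 0, obj_mask.shape[1] - 1)
    ys = np.clip(hand_pose[:, 1].astype(int), 0, obj_mask.shape[0] - 1)
    touching = obj_mask[ys, xs].any()
    return bool(overlap >= contact_thresh and touching)
```

Because each cue comes from a lightweight per-frame network, a rule-based fusion of this kind avoids running a heavy interaction discriminator on every frame, which is consistent with the reported real-time advantage.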
