Action recognition is an important problem that requires identifying act...
Self-supervised video representation learning has been shown to effectiv...
The recent success of deep learning applications has coincided with thos...
Group Activity Recognition (GAR) detects the activity performed by a gro...
This paper considers the problem of spatiotemporal object-centric reason...
Pose tracking is an important problem that requires identifying unique h...
In this paper, we introduce a contextual grounding approach that capture...
Existing visual reasoning datasets such as Visual Question Answering (VQ...
We introduce a new inference task - Visual Entailment (VE) - which diffe...