Summarizing a video requires a diverse understanding of the video, rangi...
Temporal action detection aims to predict the time intervals and the cla...
Data augmentation has recently emerged as an essential component of mode...
Recent self-supervised video representation learning methods focus on
ma...