We present a novel data-efficient semi-supervised framework to improve t...
Videos are a popular media form, where online video streaming has recent...
Constructing a large-scale labeled dataset in the real world, especially...
A common problem in the task of human-object interaction (HOI) detection...
In this work, we address the issues of missing modalities that have aris...
We introduce dense relational captioning, a novel image captioning task ...
A common problem in the human-object interaction (HOI) detection task is tha...
Video stabilization is a fundamental and important technique for higher ...
Constructing an organized dataset comprised of a large number of images ...
Our goal in this work is to train an image captioning model that generat...
Human behavior understanding is arguably one of the most important mid-l...
The best summary of a long video differs among different people due to i...
Photo collections and their applications today attempt to reflect user int...