Video Question Answering (VideoQA) is a challenging task that entails co...
The goal of this paper is to synthesise talking faces with controllable...
The goal of this work is to detect new spoken terms defined by users. Wh...
Human-Object Interaction detection is a holistic visual recognition task...
Graph neural networks (GNNs) have significantly improved the representat...