Existing emotion prediction benchmarks contain coarse emotion labels whi...
Compositional reasoning is a hallmark of human visual intelligence; yet ...
We propose a self-supervised approach for learning to perform audio sour...
Explainability is one of the key elements for building trust in AI syste...
While models for Visual Question Answering (VQA) have steadily improved ...
While there have been many proposals on how to make AI algorithms more t...
In this paper, we present a novel approach for the task of eXplainable Q...
Visual Question Answering (VQA) is the task of answering natural-languag...