Despite advances in Visual Question Answering (VQA), the ability of mode...
The ability to judge whether a caption correctly describes an image is a...
We introduce the Segment Anything (SA) project: a new task, model, and d...
Machine learning has advanced dramatically, narrowing the accuracy gap t...
Generalization to out-of-distribution data has been a problem for Visual...
Existing Visual Question Answering (VQA) models are often fragile and se...
Many name tagging approaches use local contextual information with much ...
We introduce a new task, MultiMedia Event Extraction (M2E2), which aims ...
Traditional first-order logic (FOL) reasoning systems usually rely on ma...
Textual entailment is a fundamental task in natural language processing....
We describe a Maple package that serves at least four purposes. First, o...
We present a paper abstract writing system based on an attentive neural ...
Image captioning approaches currently generate descriptions which lack s...