Visual question answering is a vision-and-language multimodal task that...
Fine-tuning pre-trained language models improves the quality of commerci...
Active visual exploration aims to assist an agent with a limited field o...
The ongoing COVID-19 pandemic calls for a multi-faceted public health re...
In light of the recent breakthroughs in automatic machine translation sy...
Regional language extraction from a natural scene image is always a chal...