The recent advances in NLP have led to a new trend of applying LLMs to...
The duality of content and style is inherent to the nature of art. For h...
Image captioning models are known to perpetuate and amplify harmful soci...
The increasing tendency to collect large and uncurated datasets to train...
Is more data always better to train vision-and-language models? We study...
Vision-and-language tasks have increasingly drawn more attention as a me...
We study societal bias amplification in image captioning. Image captioni...
This work introduces a dataset for large-scale instance-level recognitio...
Video question answering (VideoQA) is designed to answer a given questio...
Have you ever looked at a painting and wondered what is the story behind...
How far can we go with textual representations for understanding picture...
The rise of digitization of cultural documents offers large-scale conten...
Visual Question Answering (VQA) is of tremendous interest to the researc...
Computational art analysis has, through its reliance on classification t...
Answering questions related to art pieces (paintings) is a difficult tas...
To understand movies, humans constantly reason over the dialogues and ac...
We propose a novel video understanding task by fusing knowledge-based an...
We propose a novel video understanding task by fusing knowledge-based an...
In computer vision, visual arts are often studied from a purely aestheti...
Automatic art analysis aims to classify and retrieve artistic representa...
Automatic art analysis has been mostly focused on classifying artworks i...
This work proposes a system for retrieving clothing and fashion products...
Can a neural network learn the concept of visual similarity? In this wor...