Document pre-trained models and grid-based models have proven to be very...
The diversity in length constitutes a significant characteristic of text...
Visual information extraction (VIE) plays an important role in Document
...
Table structure recognition (TSR) aims at extracting tables in images in...
Multi-modal reasoning in visual question answering (VQA) has witnessed r...
Multi-modal document pre-trained models have proven to be very effective...
Image paragraph captioning aims to describe a given image with a sequenc...
Concept learning constructs visual representations that are connected to...
Video quality assessment (VQA) remains an important and challenging prob...
Semantic information has been proved effective in scene text recognition...
Aspect-based sentiment analysis (ABSA) is an emerging fine-grained senti...
With the availability of high dimensional genetic biomarkers, it is of
i...
One of the most commonly performed manipulation in a human's daily life ...
This paper targets the task of language-based moment localization. The
l...
Fine-grained visual recognition is challenging because it highly relies ...
Despite the great potential of using the low-rank matrix recovery (LRMR)...