Large Language Models (LLMs) have revolutionized natural language proces...
Due to the flexible representation of arbitrary-shaped scene text and si...
Outside-knowledge visual question answering is a challenging task that r...
Neural Architecture Search (NAS) has become more and more popular over the...
Differentiable Neural Architecture Search (DARTS) is becoming more and m...
Differentiable Architecture Search (DARTS) is a simple yet efficient Neu...
Contrastive self-supervised learning (CSL) based on instance discriminat...
Despite the excellent performance of large-scale vision-language pre-tra...
In this work, we focus on dialogue reading comprehension (DRC), a task e...
Empathy, which is widely used in psychological counselling, is a key tra...
Recently, incremental learning for person re-identification has received inc...
Despite the remarkable success of pre-trained language models (PLMs), th...
Visual Question Answering (VQA) models are prone to learn the shortcut s...
Models for Visual Question Answering (VQA) often rely on the spurious co...
In this paper, we focus our attention on private Empirical Risk Minimiza...
Recent scene text detection methods are almost all based on deep learning an...
Conversational Causal Emotion Entailment aims to detect causal utterance...
Recent studies on the lottery ticket hypothesis (LTH) show that pre-trai...
In this paper, by introducing Generalized Bernstein condition, we propos...
In the field of machine learning, many problems can be formulated as the...
Texts in scene images convey critical information for scene understandin...
Real-world data often follows a long-tailed distribution, which makes th...
Self-supervised representation learning for visual pre-training has achi...
Scene text detection has drawn the close attention of researchers. Thoug...
Due to the large success in object detection and instance segmentation, ...
Most of the existing video self-supervised methods mainly leverage tempo...
In real applications, new object classes often emerge after the detectio...
Hierarchical classification is significant for complex tasks by providin...
Recently, knowledge distillation (KD) has shown great success in BERT co...
While sophisticated Visual Question Answering models have achieved remar...
Despite the great progress achieved in unsupervised feature embedding, e...
Pairwise learning focuses on learning tasks with pairwise loss functions...
Deep hashing methods have shown great retrieval accuracy and efficiency ...
In recent years a vast amount of visual content has been generated and s...
Emotion Recognition in Conversation (ERC) is a more challenging task tha...
Zero-shot intent detection (ZSID) aims to deal with the continuously eme...
Scene text recognition has been a hot topic in computer vision. Recent m...
The focus of this survey is on the analysis of two modalities of multimo...
In this paper, we consider the problem of fine-grained image retrieval i...
Existing video self-supervised learning methods mainly rely on trimmed v...
Modern object detection methods based on convolutional neural network su...
Deep neural networks are highly effective when a large number of labeled...
Recent scene text detection works mainly focus on curve text detection. ...
In unsupervised feature learning, sample specificity based methods ignor...
Due to their high computational efficiency on a continuous space, gradie...
Though deep learning based scene text detection has achieved great progr...
Scene text recognition is a hot research topic in computer vision. Recen...
Fractional repetition (FR) codes are a class of repair efficient erasure...
Recently, generative methods have been widely used in keyphrase predicti...
In this paper, we study the statistical properties of the kernel k-means...