Large vision-language models (LVLMs) have recently witnessed rapid advan...
We introduce the Qwen-VL series, a set of large-scale vision-language mo...
Foundation language models obtain the instruction-following ability thro...
The answering quality of an aligned large language model (LLM) can be dr...
In this work, we explore a scalable way for building a general represent...
This paper proposes a new method, OFA-OCR, to transfer multimodal pretra...
Generalist models, which are capable of performing diverse multi-modal t...
The tremendous success of CLIP (Radford et al., 2021) has promoted the r...
Prompt tuning has become a new paradigm for model tuning and it has demo...
Prompt learning has recently gained great popularity in bridging the gap...
Despite the remarkable success of deep multi-modal learning in practice,...
In this work, we pursue a unified paradigm for multimodal pretraining to...
Many existing neural architecture search (NAS) solutions rely on downstr...
Recent expeditious developments in deep learning algorithms, distributed...
Mixture-of-Experts (MoE) models can achieve promising results with outra...
Vehicle search is a basic task for efficient traffic management in...
Table-to-text generation refers to generating a descriptive text from a ...
Despite the achievements of large-scale multimodal pre-training approach...
Text-to-Image generation in the general domain has long been an open pro...
In this work, we construct the largest dataset for multimodal pretrainin...
Long text generation is an important but challenging task. The main probl...
Multi-modal pretraining for learning high-level multi-modal representati...
The self-attention-based Transformer has demonstrated state-of-the-art p...
Layer normalization (LayerNorm) is a technique to normalize the distribu...
In this paper, we propose a novel end-to-end framework called KBRD, whic...
Non-autoregressive translation models (NAT) have achieved impressive inf...
Quality product descriptions are critical for providing competitive cust...
Multi-label text classification (MLTC) aims to assign multiple labels to...
We propose a novel model for Neural Machine Translation (NMT). Different...
Generating semantically coherent responses is still a major challenge in...
We propose a novel model for multi-label text classification, which is b...
Most of the Neural Machine Translation (NMT) models are based on the seq...
A great proportion of sequence-to-sequence (Seq2Seq) models for Neural M...
A sentence can be translated into more than one correct sentence. Howev...
Most of the current abstractive text summarization models are based on t...
In neural abstractive summarization, the conventional sequence-to-sequen...
Text summarization and sentiment classification both aim to capture the ...
Attention-based sequence-to-sequence models have proved successful in Neur...
Existing text generation methods tend to produce repeated and "boring" e...