Graphic layout generation, a growing research field, plays a significant...
Avoiding synthesizing specific visual concepts is an essential challenge...
The rapid advancements in large language models (LLMs) have presented ch...
Controllable video generation has gained significant attention in recent...
Code search is the task of finding code that semantically matches ...
In this report, we present our champion solution for Ego4D Natural Langu...
In this paper, we introduce a new task for code completion that focuses ...
As the capabilities of large language models (LLMs) continue to advance,...
Two-Tower Vision-Language (VL) models have shown promising improvements ...
Large language models are powerful text processors and reasoners, but ar...
Open-domain question answering is a crucial task that often requires acc...
Large Language Models (LLMs) serve as a powerful Reader in the Retrieve-then...
There are two types of approaches to solving cross-lingual transfer: mul...
Diffusion models have gained significant attention in the realm of image...
Based on the remarkable achievements of pre-trained language models in a...
Code execution is a fundamental aspect of programming language semantics...
Large language models (LLMs) can achieve highly effective performance on...
Large Language Models (LLMs) have shown remarkable performance in variou...
Effectively utilizing LLMs for complex tasks is challenging, often invol...
Evaluating the general abilities of foundation models to tackle human-le...
Chat models, such as ChatGPT, have shown impressive capabilities and hav...
Many natural language processing (NLP) tasks rely on labeled data to tra...
Artificial Intelligence (AI) has made incredible progress recently. On t...
ChatGPT is attracting cross-field interest as it provides a language i...
3D photography renders a static image into a video with appealing 3D vis...
Recently multi-lingual pre-trained language models (PLM) such as mBERT a...
Large language models can perform various reasoning tasks by using chain...
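Chain-of-thought prompting of this kind can be sketched as a few-shot prompt whose exemplars spell out intermediate reasoning before the final answer. The exemplar content and function name below are illustrative assumptions, not taken from any of the papers listed here.

```python
def chain_of_thought_prompt(question, exemplars):
    """Assemble a few-shot chain-of-thought prompt: each exemplar
    shows its intermediate reasoning before stating the answer,
    nudging the model to reason step by step on the new question."""
    parts = []
    for q, reasoning, answer in exemplars:
        parts.append(f"Q: {q}\nA: {reasoning} So the answer is {answer}.")
    parts.append(f"Q: {question}\nA:")  # the model continues from here
    return "\n\n".join(parts)

# Illustrative exemplar (not drawn from any specific paper).
demo = [(
    "If I have 3 apples and buy 2 more, how many do I have?",
    "I start with 3 apples and buying 2 more gives 3 + 2 = 5.",
    "5",
)]
print(chain_of_thought_prompt("What is 4 + 7?", demo))
```

The reasoning text in the exemplar is what distinguishes this from a standard few-shot prompt, which would map questions directly to answers.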
In this paper, we propose a large-scale language pre-training for text G...
The dual-encoder has become the de facto architecture for dense retrieva...
Dense retrieval aims to map queries and passages into low-dimensional ve...
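A minimal sketch of that dual-encoder retrieval setup, with a toy hash-based encoder standing in for a trained transformer (the encoder, dimensionality, and texts here are illustrative assumptions):

```python
import zlib
import numpy as np

def embed(texts, dim=8):
    # Toy stand-in encoder: hashes each token to a deterministic random
    # vector and sums them, then L2-normalizes. A real dual-encoder
    # would use a trained transformer; this only shows the mechanics.
    vecs = []
    for text in texts:
        v = np.zeros(dim)
        for tok in text.lower().split():
            rng = np.random.RandomState(zlib.crc32(tok.encode()) % (2**31))
            v += rng.randn(dim)
        norm = np.linalg.norm(v)
        vecs.append(v / norm if norm else v)
    return np.stack(vecs)

passages = [
    "cooking recipes for fresh pasta",
    "dense retrieval maps queries and passages to vectors",
    "hiking trails in the alps",
]
p_emb = embed(passages)                                # index built offline
q_emb = embed(["retrieval of passages with vectors"])  # query encoded online
scores = (q_emb @ p_emb.T)[0]                          # dot-product similarity
top = int(np.argmax(scores))
```

Because both sides live in the same vector space, maximum-inner-product search over a precomputed passage index replaces expensive query-passage cross-attention at retrieval time.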
Long-form numerical reasoning in financial analysis aims to generate a r...
Knowledge distillation is often used to transfer knowledge from a strong...
Developing models that can automatically generate detailed code explanat...
We introduce GENIUS: a conditional text generation model using sketches ...
Code generation models can benefit data scientists' productivity by auto...
This technical report describes the CONE approach for Ego4D Natural Lang...
Sampling proper negatives from a large document pool is vital to effecti...
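One common recipe for this (a sketch of the general technique, not any single paper's exact procedure) is to mine "hard" negatives: score the pool with the current retriever and keep the top-scoring documents that are not labeled positive.

```python
import numpy as np

def sample_hard_negatives(scores, positive_ids, k=2):
    """Pick the k highest-scoring documents that are NOT labeled
    positive; these 'hard' negatives are more informative for
    training a retriever than randomly sampled ones."""
    order = np.argsort(scores)[::-1]  # document ids, descending by score
    negs = [int(i) for i in order if int(i) not in positive_ids]
    return negs[:k]

scores = np.array([0.9, 0.8, 0.1, 0.7])
positives = {0}
print(sample_hard_negatives(scores, positives, k=2))  # → [1, 3]
```

In practice the scores come from an earlier checkpoint of the retriever itself or from a cheaper first-stage ranker such as BM25.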
Commonsense generation aims to generate a realistic sentence describing ...
This paper presents ReasonFormer, a unified reasoning framework for mirr...
Most existing pre-trained language representation models (PLMs) are sub-...
Code contrastive pre-training has recently achieved significant progress...
Retrieving evidences from tabular and textual resources is essential for...
Knowledge distillation is an effective way to transfer knowledge from a ...
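The core teacher-to-student transfer can be sketched as matching temperature-softened output distributions; the temperature value below is an illustrative choice, not taken from the papers listed here.

```python
import numpy as np

def softmax(logits, T=1.0):
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max()                 # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # KL(teacher || student) over temperature-softened distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across T.
    p = softmax(teacher_logits, T)  # soft targets from the teacher
    q = softmax(student_logits, T)  # student predictions
    return float(T * T * np.sum(p * (np.log(p + 1e-12) - np.log(q + 1e-12))))
```

In training, this term is typically mixed with the ordinary cross-entropy on the ground-truth labels, so the student learns from both hard labels and the teacher's soft targets.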
Video temporal grounding (VTG) aims to localize temporal moments in a...
In this paper, we present NUWA-Infinity, a generative model for infinite...
Due to exposure bias, most existing natural language generation (NLG) mo...
Vision-Language (VL) models with the Two-Tower architecture have dominat...
Recent research demonstrates the effectiveness of using pretrained langu...
Recently, most successful image synthesis models follow a multi-stage process ...
Non-Autoregressive generation is a sequence generation paradigm, which r...