Reward design is a fundamental, yet challenging aspect of practical rein...
Transformer models have achieved remarkable results in various natural l...
Large language models are powerful text processors and reasoners, but ar...
Generalization to unseen tasks is an important ability for few-shot lear...
In-context learning is the paradigm that adapts large language models to...
Diffusion models have gained significant attention in the realm of image...
Code execution is a fundamental aspect of programming language semantics...
We present Prompt Diffusion, a framework for enabling in-context learnin...
Diffusion models are powerful, but they require a lot of time and data t...
Evaluating the general abilities of foundation models to tackle human-le...
Many natural language processing (NLP) tasks rely on labeled data to tra...
The task of repository-level code completion is to continue writing the ...
Fine-tuning large pre-trained language models on downstream tasks has be...
Most language models (LMs) are trained and applied in an autoregressive ...
Large language models (LLMs), such as ChatGPT, are able to generate huma...
Large language models can perform various reasoning tasks by using chain...
In this paper, we propose a large-scale language pre-training for text G...
Pre-trained language models have achieved promising success in code retr...
Fine-tuning large language models for different tasks can be costly and ...
We introduce GENIUS: a conditional text generation model using sketches ...
Sampling proper negatives from a large document pool is vital to effecti...
Code contrastive pre-training has recently achieved significant progress...
Layer-wise distillation is a powerful tool to compress large models (i.e...
The task of generating code solutions for a given programming problem ca...
The information in tables can be an important complement to text, making...
Due to exposure bias, most existing natural language generation (NLG) mo...
Large Transformer-based models have exhibited superior performance in va...
Code generation is a longstanding challenge, aiming to generate a code s...
Large language models such as GPT-3 and PaLM have shown remarkable perfo...
For stable training of generative adversarial networks (GANs), injecting...
Non-autoregressive generation is a sequence generation paradigm, which r...
Active learning, which effectively collects informative unlabeled data f...
Dialog response generation in open domain is an important research topic...
Pre-trained language models have demonstrated superior performance in va...
Model ensemble is a popular approach to produce a low-variance and well-...
Hyperparameter (HP) tuning in deep learning is an expensive process, pro...
Recently the prompt-tuning paradigm has attracted significant attention....
To guide the generation of large pretrained language models (LM), previo...
Employing a forward Markov diffusion chain to gradually map the data to ...
Token-mixing multi-layer perceptron (MLP) models have shown competitive ...
Recent research has shown the existence of significant redundancy in lar...
Reasoning over natural language is a long-standing goal for the research...
In this paper, we propose the CodeRetriever model, which combines the un...
Virtual support agents have grown in popularity as a way for businesses ...
This paper presents a new pre-trained language model, DeBERTaV3, which i...
Gigantic pre-trained models have become central to natural language proc...
Large pretrained vision-language (VL) models can learn a new task with a...
Current dense text retrieval models face two typical challenges. First, ...
Cross-lingual pre-training has achieved great successes using monolingua...
Adversarial regularization can improve model generalization in many natu...