There have been a lot of interest in the scaling properties of Transform...
We introduce ART, a new corpus-level autoencoding approach for training ...
Learning multimodal representations involves discovering correspondences...
Text generation is of great importance to many natural language processi...
We present a memory-augmented approach to condition an autoregressive
la...
When training and evaluating machine learning models on a large number o...
Transformer architectures have achieved state-of-the-art results on a va...
There remain many open questions pertaining to the scaling behaviour of
...
We present an end-to-end differentiable training method for
retrieval-au...
Transformers have outperformed recurrent neural networks (RNNs) in natur...
Transformers are state-of-the-art models for a variety of sequence model...
We present a language model that combines a large parametric neural netw...
Our world is open-ended, non-stationary and constantly evolving; thus wh...
Textual representation learners trained on large amounts of data have
ac...
We review motivations, definition, approaches, and methodology for
unsup...
We present a generative model for multitask conditional language generat...
Recent improvements in large-scale language models have driven progress ...
State-of-the-art unsupervised multilingual models (e.g., multilingual BE...
We show state-of-the-art word representation learning methods maximize a...
Neural networks are part of many contemporary NLP systems, yet their
emp...
We introduce a lifelong language learning setup where a model needs to l...
We define general linguistic intelligence as the ability to reuse previo...
We present a new theoretical perspective of data noising in recurrent ne...
We introduce a neural network that represents sentences by composing the...
Solving algebraic word problems requires executing a series of arithmeti...
We empirically characterize the performance of discriminative and genera...
We use reinforcement learning to learn tree-structured neural networks f...
We show that an end-to-end deep learning approach can be used to recogni...
When applying machine learning to problems in NLP, there are many choice...
We consider the scenario where the parameters of a probabilistic model a...