This paper explores the effectiveness of model-generated signals in impr...
Language models have steadily increased in size over the past few years....
A big convergence of model architectures across language, vision, speech...
Sparse mixture of experts provides larger model capacity while requiring...
We present an efficient method of pretraining large-scale autoencoding
l...
We present a new framework AMOS that pretrains text encoders with an
Adv...
In this paper, we introduce ELECTRA-style tasks to cross-lingual languag...
We consider the problem of scaling automated suggested replies for Outlo...
We present COCO-LM, a new self-supervised learning framework that pretra...
Learning vector embeddings of complex networks is a powerful approach us...
Obtaining enough labeled data to robustly train complex discriminative m...