Large language models are trained in two stages: (1) unsupervised pretra...
We present a theory for the previously unexplained divergent behavior no...
We perform an effective-theory analysis of forward-backward signal
propa...
Generative language models define distributions over sequences of tokens...
Large language models, which are often trained for hundreds of thousands...
Understanding how knowledge about the world is represented within model-...
The cost to train machine learning models has been increasing exponentia...
On April 13th, 2019, OpenAI Five became the first AI system to defeat th...