Quentin Anthony

DeepAI

Featured Co-authors

Edward Raff
74 publications
Irina Rish
51 publications
Eugene Belilovsky
40 publications
Yuxiong He
34 publications
Jason Phang
33 publications
Stella Biderman
26 publications
Timothée Lesort
23 publications
Horace He
11 publications
Ammar Ahmad Awan
10 publications
Kshitij Gupta
10 publications
Hari Subramoni
9 publications

research

∙ 08/08/2023

Continual Pre-Training of Large Language Models: How to (re)warm your model?

Large language models (LLMs) are routinely pre-trained on billions of to...

Kshitij Gupta, et al.

research

∙ 04/21/2023

Emergent and Predictable Memorization in Large Language Models

Memorization, or the tendency of large language models (LLMs) to output ...

Stella Biderman, et al.

research

∙ 04/03/2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling

How do large language models (LLMs) develop and evolve over the course o...

Stella Biderman, et al.

research

∙ 03/15/2023

MCR-DL: Mix-and-Match Communication Runtime for Deep Learning

In recent years, the training requirements of many state-of-the-art Deep...

Quentin Anthony, et al.

research

∙ 04/14/2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model

We introduce GPT-NeoX-20B, a 20 billion parameter autoregressive languag...

Sid Black, et al.

research

∙ 09/17/2021

Cross-layer Visualization and Profiling of Network and I/O Communication for HPC Clusters

Understanding and visualizing the full-stack performance trade-offs and ...

Pouya Kousha, et al.

research

∙ 11/12/2019

HyPar-Flow: Exploiting MPI and Keras for Scalable Hybrid-Parallel DNN Training using TensorFlow

The enormous amount of data and computation required to train DNNs have ...

Ammar Ahmad Awan, et al.
