Carlo Luschi

research

∙ 03/20/2023

Unit Scaling: Out-of-the-Box Low-Precision Training

We present unit scaling, a paradigm for designing deep learning models t...

Charlie Blake, et al.

research

∙ 11/22/2022

BESS: Balanced Entity Sampling and Sharing for Large-Scale Knowledge Graph Completion

We present the award-winning submission to the WikiKG90Mv2 track of OGB-...

Alberto Cattaneo, et al.

research

∙ 06/06/2022

8-bit Numerical Formats for Deep Neural Networks

Given the current trend of increasing size and complexity of machine lea...

Badreddine Noune, et al.

research

∙ 08/13/2021

Towards Structured Dynamic Sparse Pre-Training of BERT

Identifying algorithms for computationally efficient unsupervised training...

Anastasia Dietrich, et al.

research

∙ 06/10/2021

GroupBERT: Enhanced Transformer Architecture with Efficient Grouped Structures

Attention-based language models have become a critical component in stat...

Ivan Chelombiev, et al.

research

∙ 06/07/2021

Proxy-Normalizing Activations to Match Batch Normalization while Removing Batch Dependence

We investigate the reasons for the performance degradation incurred with...

Antoine Labatie, et al.

research

∙ 06/07/2021

Making EfficientNet More Efficient: Exploring Batch-Independent Normalization, Group Convolutions and Reduced Resolution Training

Much recent research has been dedicated to improving the efficiency of t...

Dominic Masters, et al.

research

∙ 12/07/2020

Parallel Training of Deep Networks with Local Updates

Deep learning models trained on large data sets have been widely success...

Michael (Misha) Laskin, et al.

research

∙ 11/09/2020

Improving Neural Network Training in Low Dimensional Random Bases

Stochastic Gradient Descent (SGD) has proven to be remarkably effective ...

Frithjof Gressmann, et al.

research

∙ 04/20/2018

Revisiting Small Batch Training for Deep Neural Networks

Modern deep neural network training is typically based on mini-batch sto...

Dominic Masters, et al.
