Obtaining versions of deep neural networks that are both highly accurate...
Leveraging second-order information at the scale of deep networks is one...
Recent vision architectures and self-supervised training methods enable ...
We provide a new efficient version of the backpropagation algorithm, spe...
The breakthrough performance of large language models (LLMs) comes with ...
Models from the Vision Transformer (ViT) family have recently provided b...
We revisit the performance of the classic gradual magnitude pruning (GMP...
Artificial Intelligence (AI) is one of the most promising technologies o...
Pre-trained Transformer-based language models have become a key building...
Efficiently approximating local curvature information of the loss functi...