Transformers have become the dominant model in deep learning, but the re...
We focus on the problem of learning without forgetting from multiple tas...
Stochastic gradient descent plays a fundamental role in nearly all
appli...
Transformers have become the state-of-the-art neural network architectur...
Decentralized learning with private data is a central problem in machine...
In this paper we propose augmenting Vision Transformer models with learn...
The architecture and the parameters of neural networks are often optimiz...
In this work we propose a HyperTransformer, a transformer-based model fo...
In this paper, we introduce a new type of generalized neural network whe...
ML models often exhibit unexpectedly poor behavior when they are deploye...
Nonlinear embedding manifold learning methods provide invaluable visual
...