In this paper, we explore two fundamental first-order algorithms in conv...
We present a partially personalized formulation of Federated Learning (F...
This paper proposes a new easy-to-implement parameter-free gradient-base...
We present an algorithm for minimizing an objective with hard-to-compute...
In this work, we consider the problem of minimizing the sum of Moreau en...
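For context, the Moreau envelope referenced here is the standard object from convex analysis (this definition is textbook background, not taken from the abstract itself): for a parameter \gamma > 0,

    M_{\gamma} f(x) = \min_y \left\{ f(y) + \tfrac{1}{2\gamma} \|y - x\|^2 \right\},

and the minimizer defining it is exactly the proximal operator \mathrm{prox}_{\gamma f}(x).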
We analyze the performance of a variant of Newton's method with quadratic ...
In this work, we propose new adaptive step size strategies that improve ...
The existing analysis of asynchronous stochastic gradient descent (SGD) ...
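As background (this is the standard model used throughout the asynchronous-SGD literature, not a claim about the specific analysis announced above), asynchronous SGD applies stochastic gradients computed at stale iterates:

    x_{t+1} = x_t - \gamma_t \nabla f_{i_t}(x_{t - \tau_t}),

where \tau_t \ge 0 is the (possibly random) delay of the worker whose gradient arrives at step t.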
We introduce ProxSkip – a surprisingly simple and provably efficient met...
We present a theoretical study of server-side optimization in federated ...
We present a Newton-type method that converges fast from any initializat...
We propose a family of lossy integer compressions for Stochastic Gradien...
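One common building block for such schemes is stochastic rounding, which maps scaled floats to integers while remaining unbiased in expectation; the sketch below is a generic illustration (the function names and the rounding scale are placeholders, not the family proposed in the paper).

    import numpy as np

    def stochastic_round(x, scale=127.0, rng=None):
        # Scale the vector and round each entry up or down at random so that
        # E[round(y)] = y; clipping/overflow handling is omitted for brevity.
        rng = np.random.default_rng() if rng is None else rng
        y = x * scale
        low = np.floor(y)
        go_up = rng.random(y.shape) < (y - low)   # prob = fractional part
        return (low + go_up).astype(np.int32), scale

    def dequantize(q, scale):
        return q.astype(np.float64) / scale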
Random Reshuffling (RR), also known as Stochastic Gradient Descent (SGD)...
Random Reshuffling (RR) is an algorithm for minimizing finite-sum functi...
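For readers who have not seen it spelled out, Random Reshuffling simply passes over a fresh permutation of the component functions in every epoch instead of sampling with replacement; the snippet below is a minimal generic sketch on a toy least-squares problem (all names and data are placeholders, not code from these papers).

    import numpy as np

    def random_reshuffling(grad_i, x0, n, lr=0.01, epochs=20, seed=0):
        # Minimize (1/n) * sum_i f_i(x): one pass per epoch over a fresh
        # random permutation of the n components (vs. i.i.d. sampling in SGD).
        rng = np.random.default_rng(seed)
        x = x0.copy()
        for _ in range(epochs):
            for i in rng.permutation(n):
                x = x - lr * grad_i(x, i)
        return x

    # toy usage: least squares with rows A[i] and targets b[i]
    A = np.random.randn(50, 5); b = np.random.randn(50)
    x_hat = random_reshuffling(lambda x, i: (A[i] @ x - b[i]) * A[i],
                               np.zeros(5), n=50)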
We introduce a new primal-dual algorithm for minimizing the sum of three...
We present two new remarkably simple stochastic second-order methods for...
We present a strikingly simple proof that two rules are sufficient to au...
We present a new perspective on the celebrated Sinkhorn algorithm by sho...
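As background, the classical Sinkhorn iteration alternately rescales the rows and columns of a positive kernel matrix until both prescribed marginals are matched; the snippet below is a minimal generic version for entropy-regularized optimal transport (variable names are placeholders), not the reformulation studied in this paper.

    import numpy as np

    def sinkhorn(C, r, c, reg=0.1, iters=500):
        # Entropy-regularized OT between marginals r and c with cost matrix C.
        K = np.exp(-C / reg)                 # positive kernel
        u = np.ones_like(r)
        v = np.ones_like(c)
        for _ in range(iters):
            u = r / (K @ v)                  # fix row marginals
            v = c / (K.T @ u)                # fix column marginals
        return u[:, None] * K * v[None, :]   # plan = diag(u) K diag(v)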
We revisit the local Stochastic Gradient Descent (local SGD) method and ...
We provide the first convergence analysis of local gradient descent for ...
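To make the setting concrete: in local gradient descent each worker runs several gradient steps on its own local objective and only then communicates, with the server averaging the resulting models. The sketch below is a generic single-machine simulation of that pattern (names are placeholders), not the exact scheme analyzed in these papers.

    import numpy as np

    def local_gd(local_grads, x0, lr=0.01, local_steps=5, rounds=50):
        # local_grads: one gradient oracle g_m(x) per worker.
        x = x0.copy()
        for _ in range(rounds):
            models = []
            for g in local_grads:              # each worker works in isolation
                y = x.copy()
                for _ in range(local_steps):   # several local gradient steps
                    y = y - lr * g(y)
                models.append(y)
            x = np.mean(models, axis=0)        # communication: average models
        return x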
When forecasting time series with a hierarchical structure, the existing...
We consider a new extension of the extragradient method that is motivate...
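For reference, the classical extragradient step for a monotone operator F (the baseline this abstract extends; the extension itself is not reproduced here) first takes an exploratory step and then re-evaluates the operator there:

    x_{k+1/2} = x_k - \gamma F(x_k),        x_{k+1} = x_k - \gamma F(x_{k+1/2}).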
It is well known that many optimization methods, including SGD, SAGA, an...
Training very large machine learning models requires a distributed compu...
We propose a randomized first-order optimization method – SEGA (SkEtched ...
We develop and analyze an asynchronous algorithm for distributed convex ...