Inspired by Gauss-Newton-like methods, we study the benefit of leveragin...
The top-k operator returns a k-sparse vector, where the non-zero values
...
Regularized optimal transport (OT) is now increasingly used as a loss or...
Energy-based models, a.k.a. energy networks, perform inference by optimi...
Tuning the step size of stochastic gradient descent is tedious and error...
Attention-based models such as Transformers involve pairwise interaction...
Exponential families are widely used in machine learning; they include m...
Automatic differentiation (autodiff) has revolutionized machine learning...
Finding the optimal hyperparameters of a model can be cast as a bilevel
...
Self-supervised pre-training using so-called "pretext" tasks has recentl...
The training of deep residual neural networks (ResNets) with backpropaga...
Computing the discrepancy between time series of variable sizes is
notor...
Setting regularization parameters for Lasso-type estimators is notorious...
The sorting operation is one of the most basic and commonly used buildin...
Machine learning pipelines often rely on optimization procedures to make...
We propose in this paper a general framework for deriving loss functions...
Building upon recent advances in entropy-regularized optimal transport, ...
Over the past decades, numerous loss functions have been proposed f...
We study in this paper Fenchel-Young losses, a generic way to construct
...
Optimal transport as a loss for machine learning optimization problems h...
Structured prediction requires searching over a combinatorial number of
...
Dynamic programming (DP) solves a variety of structured combinatorial
pr...
This paper presents a novel two-step approach for the fundamental proble...
Entropic regularization is quickly emerging as a new standard in optimal...
Modern neural networks are often augmented with an attention mechanism, ...
Factorization machines and polynomial networks are supervised polynomial...
We propose in this paper a differentiable learning loss between time ser...
Polynomial networks and factorization machines are two recently-proposed...
Factorization machines (FMs) are a supervised learning approach that can...