Mathieu Blondel

research

∙ 08/17/2023

Dual Gauss-Newton Directions for Deep Learning

Inspired by Gauss-Newton-like methods, we study the benefit of leveragin...

0 Vincent Roulet, et al. ∙

research

∙ 02/02/2023

Fast, Differentiable and Sparse Top-k: a Convex Analysis Perspective

The top-k operator returns a k-sparse vector, where the non-zero values ...

0 Michael E. Sander, et al. ∙

research

∙ 09/30/2022

Sparsity-Constrained Optimal Transport

Regularized optimal transport (OT) is now increasingly used as a loss or...

0 Tianlin Liu, et al. ∙

research

∙ 05/19/2022

Learning Energy Networks with Generalized Fenchel-Young Losses

Energy-based models, a.k.a. energy networks, perform inference by optimi...

0 Mathieu Blondel, et al. ∙

research

∙ 02/24/2022

Cutting Some Slack for SGD with Adaptive Polyak Stepsizes

Tuning the step size of stochastic gradient descent is tedious and error...

2 Robert M. Gower, et al. ∙

research

∙ 10/22/2021

Sinkformers: Transformers with Doubly Stochastic Attention

Attention based models such as Transformers involve pairwise interaction...

0 Michael E. Sander, et al. ∙

research

∙ 08/04/2021

Sparse Continuous Distributions and Fenchel-Young Losses

Exponential families are widely used in machine learning; they include m...

0 André F. T. Martins, et al. ∙

research

∙ 05/31/2021

Efficient and Modular Implicit Differentiation

Automatic differentiation (autodiff) has revolutionized machine learning...

0 Mathieu Blondel, et al. ∙

research

∙ 05/04/2021

Implicit differentiation for fast hyperparameter selection in non-smooth convex learning

Finding the optimal hyperparameters of a model can be cast as a bilevel ...

0 Quentin Bertrand, et al. ∙

research

∙ 03/17/2021

Self-Supervised Learning of Audio Representations from Permutations with Differentiable Ranking

Self-supervised pre-training using so-called "pretext" tasks has recentl...

12 Andrew N. Carr, et al. ∙

research

∙ 02/15/2021

Momentum Residual Neural Networks

The training of deep residual neural networks (ResNets) with backpropaga...

0 Michael E. Sander, et al. ∙

research

∙ 10/16/2020

Differentiable Divergences Between Time Series

Computing the discrepancy between time series of variable sizes is notor...

0 Mathieu Blondel, et al. ∙

research

∙ 02/20/2020

Implicit differentiation of Lasso-type models for hyperparameter optimization

Setting regularization parameters for Lasso-type estimators is notorious...

7 Quentin Bertrand, et al. ∙

research

∙ 02/20/2020

Fast Differentiable Sorting and Ranking

The sorting operation is one of the most basic and commonly used buildin...

0 Mathieu Blondel, et al. ∙

research

∙ 02/20/2020

Learning with Differentiable Perturbed Optimizers

Machine learning pipelines often rely on optimization procedures to make...

34 Quentin Berthet, et al. ∙

research

∙ 10/24/2019

Structured Prediction with Projection Oracles

We propose in this paper a general framework for deriving loss functions...

0 Mathieu Blondel, et al. ∙

research

∙ 05/15/2019

Geometric Losses for Distributional Learning

Building upon recent advances in entropy-regularized optimal transport, ...

11 Arthur Mensch, et al. ∙

research

∙ 01/08/2019

Learning with Fenchel-Young Losses

Over the past decades, numerous loss functions have been proposed f...

0 Mathieu Blondel, et al. ∙

research

∙ 05/24/2018

Learning Classifiers with Fenchel-Young Losses: Generalized Entropies, Margins, and Algorithms

We study in this paper Fenchel-Young losses, a generic way to construct ...

0 Mathieu Blondel, et al. ∙

research

∙ 02/15/2018

Blind Source Separation with Optimal Transport Non-negative Matrix Factorization

Optimal transport as a loss for machine learning optimization problems h...

0 Antoine Rolet, et al. ∙

research

∙ 02/12/2018

SparseMAP: Differentiable Sparse Structured Inference

Structured prediction requires searching over a combinatorial number of ...

0 Vlad Niculae, et al. ∙

research

∙ 02/11/2018

Differentiable Dynamic Programming for Structured Prediction and Attention

Dynamic programming (DP) solves a variety of structured combinatorial pr...

0 Arthur Mensch, et al. ∙

research

∙ 11/07/2017

Large-Scale Optimal Transport and Mapping Estimation

This paper presents a novel two-step approach for the fundamental proble...

0 Vivien Seguy, et al. ∙

research

∙ 10/17/2017

Smooth and Sparse Optimal Transport

Entropic regularization is quickly emerging as a new standard in optimal...

0 Mathieu Blondel, et al. ∙

research

∙ 05/22/2017

A Regularized Framework for Sparse and Structured Neural Attention

Modern neural networks are often augmented with an attention mechanism, ...

0 Vlad Niculae, et al. ∙

research

∙ 05/22/2017

Multi-output Polynomial Networks and Factorization Machines

Factorization machines and polynomial networks are supervised polynomial...

0 Mathieu Blondel, et al. ∙

research

∙ 03/05/2017

Soft-DTW: a Differentiable Loss Function for Time-Series

We propose in this paper a differentiable learning loss between time ser...

0 Marco Cuturi, et al. ∙

research

∙ 07/29/2016

Polynomial Networks and Factorization Machines: New Insights and Efficient Training Algorithms

Polynomial networks and factorization machines are two recently-proposed...

0 Mathieu Blondel, et al. ∙

research

∙ 07/25/2016

Higher-Order Factorization Machines

Factorization machines (FMs) are a supervised learning approach that can...

0 Mathieu Blondel, et al. ∙

