We explore the impact of parameter sparsity on the scaling behavior of T...
Sparse mixture-of-experts architectures (MoEs) scale model capacity witho...
The ubiquitous and demonstrably suboptimal choice of resizing images to ...
Open-vocabulary object detection has benefited greatly from pretrained v...
Heteroscedastic classifiers, which learn a multivariate Gaussian distrib...
Multimodal models are becoming increasingly effective, in part due to un...
Training large, deep neural networks to convergence can be prohibitively...
Pixel-level labels are particularly expensive to acquire. Hence, pretrai...
Scaling language models improves performance but comes with significant ...
Effective scaling and a flexible task interface enable large language mo...
Large sparsely-activated models have obtained excellent performance in m...
We introduce UViM, a unified approach capable of modeling a wide range o...
Recent progress in Medical Artificial Intelligence (AI) has delivered sy...
Transformers are widely applied to solve natural language understanding ...
Machine learning models based on the aggregated outputs of submodels, ei...
The world of empirical machine learning (ML) strongly relies on benchmar...
Accurate estimation of predictive uncertainty (model calibration) is ess...
Sparsely-gated Mixture of Experts networks (MoEs) have demonstrated exce...
Attention-based neural networks such as the Vision Transformer (ViT) hav...
Convolutional Neural Networks (CNNs) are the go-to model for computer vi...
Before deploying machine learning models it is critical to assess their ...
Meta and transfer learning are two successful families of approaches to ...
Transfer learning is a standard technique to improve performance on task...
ML models often exhibit unexpectedly poor behavior when they are deploye...
While the Transformer architecture has become the de facto standard for ...
In the low-data regime, it is difficult to train good supervised models ...
We propose a method to learn image representations from uncurated videos...
Automatically finding good and general remote sensing representations al...
Transfer of pre-trained representations can improve sample efficiency an...
Modern deep convolutional networks (CNNs) are often criticized for not g...
In self-supervised visual representation learning, a feature extractor i...
Transfer of pre-trained representations improves sample efficiency and s...
We propose a general framework for self-supervised learning of transfera...
Given the importance of remote sensing, surprisingly little attention ha...
Fine-tuning large pre-trained models is an effective transfer mechanism ...
Neural architecture search (NAS) enabled the discovery of state-of-the-a...
Conditional GANs are at the forefront of natural image synthesis. The ma...
GANs involve training two networks in an adversarial game, where each ne...
Training Generative Adversarial Networks (GANs) is notoriously challengi...
Building effective neural networks requires many design choices. These i...
We analyze the language learned by an agent trained with reinforcement l...
We frame Question Answering as a Reinforcement Learning task, an approac...
We present an LDA approach to entity disambiguation. Each topic is assoc...
Information-theoretic active learning has been widely studied for probab...