We explore the impact of parameter sparsity on the scaling behavior of T...
Sparse mixture-of-experts architectures (MoEs) scale model capacity witho...
The ubiquitous and demonstrably suboptimal choice of resizing images to ...
Open-vocabulary object detection has benefited greatly from pretrained v...
Heteroscedastic classifiers, which learn a multivariate Gaussian distrib...
Multimodal models are becoming increasingly effective, in part due to un...
Training large, deep neural networks to convergence can be prohibitively...
Pixel-level labels are particularly expensive to acquire. Hence, pretrai...
Scaling language models improves performance but comes with significant ...
Effective scaling and a flexible task interface enable large language mo...
Large sparsely-activated models have obtained excellent performance in m...
We introduce UViM, a unified approach capable of modeling a wide range o...
Recent progress in Medical Artificial Intelligence (AI) has delivered sy...
Transformers are widely applied to solve natural language understanding ...
Machine learning models based on the aggregated outputs of submodels, ei...
The world of empirical machine learning (ML) strongly relies on benchmar...
Accurate estimation of predictive uncertainty (model calibration) is ess...
Sparsely-gated Mixture of Experts networks (MoEs) have demonstrated exce...
Attention-based neural networks such as the Vision Transformer (ViT) hav...
Convolutional Neural Networks (CNNs) are the go-to model for computer vi...
Before deploying machine learning models it is critical to assess their ...
Meta and transfer learning are two successful families of approaches to ...
Transfer learning is a standard technique to improve performance on task...
ML models often exhibit unexpectedly poor behavior when they are deploye...
While the Transformer architecture has become the de facto standard for ...
In the low-data regime, it is difficult to train good supervised models ...
We propose a method to learn image representations from uncurated videos...
Automatically finding good and general remote sensing representations al...
Transfer of pre-trained representations can improve sample efficiency an...
Modern deep convolutional networks (CNNs) are often criticized for not g...
In self-supervised visual representation learning, a feature extractor i...
Transfer of pre-trained representations improves sample efficiency and s...
We propose a general framework for self-supervised learning of transfera...
Given the importance of remote sensing, surprisingly little attention ha...
Fine-tuning large pre-trained models is an effective transfer mechanism ...
Neural architecture search (NAS) enabled the discovery of state-of-the-a...
Conditional GANs are at the forefront of natural image synthesis. The ma...
GANs involve training two networks in an adversarial game, where each ne...
Training Generative Adversarial Networks (GANs) is notoriously challengi...
Building effective neural networks requires many design choices. These i...
We analyze the language learned by an agent trained with reinforcement l...
We frame Question Answering as a Reinforcement Learning task, an approac...
We present an LDA approach to entity disambiguation. Each topic is assoc...
Information-theoretic active learning has been widely studied for probab...