In this paper we consider the online Submodular Welfare (SW) problem. In...
We consider the Max-3-Section problem, where we are given an undirected graph...
The prevalence of large-scale multimodal datasets presents unique challe...
Many recent improvements in NLP stem from the development and use of lar...
We introduce an extensive dataset for multilingual probing of morphologi...
Adaptive inference is a simple method for reducing inference costs. The...
NLP models often rely on superficial cues known as dataset biases to ach...
Speech language models (SpeechLMs) process and generate acoustic data on...
Weird, unusual, and uncanny images pique the curiosity of observers beca...
A core process in human cognition is analogical mapping: the ability to...
The attention mechanism is considered the backbone of the widely-used Transformer...
Getting the most out of limited resources allows advances in natural lan...
While vision-and-language models perform well on tasks such as visual question...
The size of pretrained models is increasing, and so is their performance...
By providing unprecedented access to computational resources, cloud comp...
Recent work has shown that deep learning models in NLP are highly sensit...
The remarkable success of large transformer-based models such as BERT, R...
Purpose - To develop and validate a deep learning (DL) framework for the...
Pretrained language models are typically trained on massive web-based data...
Transformer architectures have achieved state-of-the-art results on a va...
Research in NLP is often supported by experimental results, and improved...
Masked language modeling (MLM) is one of the key sub-tasks in vision-lan...
Motivated by the classic Generalized Assignment Problem, we consider the...
In this work, we initiate the study of fault tolerant Max Cut, where giv...
We consider the 0-Extension problem, where we are given an undirected gr...
Language models trained on billions of tokens have recently led to unprecedented...
Recent works have shown that supervised models often exploit data artifa...
Transformers are state-of-the-art models for a variety of sequence model...
Many algorithms for maximizing a monotone submodular function subject to...
The capacity of neural networks like the widely adopted transformer is k...
The urgency of mitigating COVID-19 has spawned a large and diverse body ...
Multi-head attentive neural architectures have achieved state-of-the-art...
We develop a formal hierarchy of the expressive capacity of RNN architectures...
As NLP models become larger, executing a trained model requires signific...
Fine-tuning pretrained contextual word embedding models to supervised downstream...
Contextual word representations, typically trained on unstructured, unla...
Neural models for NLP typically use large numbers of parameters to reach...
Research in natural language processing proceeds, in part, by demonstrat...
We present PaLM, a hybrid parser and neural language model. Building on ...
The computations required for deep learning research have been doubling ...
Correlation clustering is a fundamental combinatorial optimization probl...
Motivated by the use of high speed circuit switches in large scale data centers...
Several datasets have recently been constructed to expose brittleness in...
Semidefinite programming is a powerful tool in the design and analysis o...
Despite the tremendous empirical success of neural models in natural lan...
Given a partial description like "she opened the hood of the car," human...
While recurrent neural networks have found success in a variety of natur...
Recurrent and convolutional neural networks comprise two distinct famili...
Motivated by applications in machine learning, such as subset selection ...
Peer reviewing is a central component in the scientific publishing proce...