Pushing the boundaries of machine learning often requires exploring diff...
The Mixture of Experts (MoE) is a widely known neural architecture where...
Large volumes of text data have contributed significantly to the develop...
Advanced AI models hold the promise of tremendous benefits for humanity,...
Generative AI systems across modalities, ranging from text, image, audio...
Emergent properties have been widely adopted as a term to describe behav...
Perception of toxicity evolves over time and often differs between
geogr...
Ensembling independent deep neural networks (DNNs) is a simple and effec...
Multilingual models are often particularly dependent on scaling to gener...
Despite widespread use of LLMs as conversational agents, evaluations of
...
Modern machine learning research relies on relatively few carefully cura...
Getting the most out of limited resources allows advances in natural lan...
We study the impact of different pruning techniques on the representatio...
Knowledge distillation has proven to be an effective technique in improv...
How do neural network image classifiers respond to simpler and simpler
i...
A "bigger is better" explosion in the number of parameters in deep neura...
As machine learning models are increasingly employed to assist human
dec...
Not all examples are created equal, but standard deep neural network tra...
The quest for determinism in machine learning has disproportionately foc...
Training sparse networks to converge to the same performance as dense ne...
The popularity and widespread use of pruning and quantization is driven ...
Hardware, systems and algorithms research communities have historically ...
In machine learning, a question of great interest is understanding what
...
With the recent wave of progress in artificial intelligence (AI) has com...
Neural network pruning techniques have demonstrated it is possible to re...
We rigorously evaluate three state-of-the-art techniques for inducing
sp...
Estimating the influence of a given feature to a model prediction is
cha...
Saliency methods aim to explain the predictions of deep neural networks....