Generative Large Language Models (LLMs) have demonstrated remarkable res...
Pre-trained machine learning (ML) models have shown great performance fo...
Recent advances in state-of-the-art DNN architecture design have been mo...
The recent emergence of Large Language Models based on the Transformer a...
Physics-informed neural networks (PINNs) incorporate physical knowledge ...
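As a minimal illustration of the general PINN construction (not the specific formulation of the truncated abstract above), a network \(u_\theta\) approximating the solution of a PDE \(\mathcal{N}[u]=0\) is trained by minimizing a data-fit term plus the squared PDE residual at collocation points; the weight \(\lambda\) and the point counts \(N_d, N_r\) are illustrative:

\[
\mathcal{L}(\theta) \;=\; \frac{1}{N_d}\sum_{i=1}^{N_d}\bigl|u_\theta(x_i) - u_i\bigr|^2 \;+\; \frac{\lambda}{N_r}\sum_{j=1}^{N_r}\bigl|\mathcal{N}[u_\theta](x_j)\bigr|^2 .
\]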
The recently proposed Conformer model has become the de facto backbone m...
Recent work in scientific machine learning has developed so-called physi...
A major challenge in deploying transformer models is their prohibitive i...
End-to-end neural network models achieve improved performance on various...
As soon as abstract mathematical computations were adapted to computatio...
Pruning is an effective method to reduce the memory footprint and FLOPs ...
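As a hedged sketch of the simplest baseline this line of work starts from, unstructured magnitude pruning (not necessarily the method of the paper above); the function name and the 90% sparsity level are illustrative:

import torch

def magnitude_prune_mask(weight, sparsity):
    # Unstructured magnitude pruning: zero out the `sparsity` fraction of
    # weights with the smallest absolute value; returns a {0, 1} mask.
    k = int(sparsity * weight.numel())
    if k == 0:
        return torch.ones_like(weight)
    threshold = weight.abs().flatten().kthvalue(k).values
    return (weight.abs() > threshold).to(weight.dtype)

# Example: prune 90% of a random weight matrix.
w = torch.randn(256, 256)
w_pruned = w * magnitude_prune_mask(w, sparsity=0.9)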
Transformer-based models, like BERT and RoBERTa, have achieved state-of-...
Quantization is one of the key techniques used to make Neural Networks (...
Robustness of machine learning models to various adversarial and non-adv...
We introduce AdaHessian, a second order stochastic optimization algorith...
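A minimal sketch of the underlying idea, preconditioning the update with a stochastic estimate of the Hessian diagonal obtained by Hutchinson's method; this is not the actual AdaHessian implementation (which adds momentum and spatial averaging), and the function names and hyperparameters are assumptions:

import torch

def hutchinson_diag_hessian(loss, params, n_samples=1):
    # First backward pass with create_graph=True so a second pass is possible.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    diag = [torch.zeros_like(p) for p in params]
    for _ in range(n_samples):
        # Rademacher probe vectors z with entries in {-1, +1}.
        zs = [torch.randint_like(p, high=2) * 2.0 - 1.0 for p in params]
        # Hessian-vector products H z via double backpropagation.
        hvps = torch.autograd.grad(grads, params, grad_outputs=zs, retain_graph=True)
        for d, z, hv in zip(diag, zs, hvps):
            d.add_(z * hv / n_samples)  # E[z * (H z)] approximates diag(H)
    return [g.detach() for g in grads], [d.abs() for d in diag]

def preconditioned_step(loss, params, lr=0.1, eps=1e-8):
    grads, hdiag = hutchinson_diag_hessian(loss, params)
    with torch.no_grad():
        for p, g, d in zip(params, grads, hdiag):
            # Adam-like step using the Hessian diagonal as preconditioner.
            p -= lr * g / (d.sqrt() + eps)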
The standard normalization method for neural network (NN) models used in...
Quantization is a promising approach for reducing the inference time and...
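For reference, the uniform symmetric quantizer that most such methods build on maps a floating-point tensor \(x\) to b-bit integers through a single scale \(S\) (notation illustrative):

\[
Q(x) \;=\; \mathrm{round}\!\left(\frac{x}{S}\right), \qquad S \;=\; \frac{\max|x|}{2^{\,b-1}-1}, \qquad \hat{x} \;=\; S\,Q(x),
\]

so storage drops from 32 bits to b bits per value, and matrix multiplications can be carried out in integer arithmetic.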
We present PyHessian, a new scalable framework that enables fast computa...
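A brief sketch of the kind of computation such a framework performs: power iteration for the dominant Hessian eigenvalue using only Hessian-vector products (double backpropagation), so the Hessian is never formed explicitly. This is a generic illustration, not the PyHessian API; the function name and iteration count are assumptions:

import torch

def top_hessian_eigenvalue(loss, params, iters=20):
    # First backward pass keeps the graph so Hessian-vector products work.
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    eigenvalue = 0.0
    for _ in range(iters):
        # Normalize the current iterate.
        norm = torch.sqrt(sum((x * x).sum() for x in v))
        v = [x / norm for x in v]
        # Hessian-vector product H v via a second backward pass.
        hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
        # Rayleigh quotient v^T (H v) estimates the dominant eigenvalue.
        eigenvalue = sum((x * y).sum() for x, y in zip(v, hv)).item()
        v = [y.detach() for y in hv]
    return eigenvalue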
Quantization is an effective method for reducing memory footprint and in...
Modern neural networks are increasingly bottlenecked by the limited capa...
Transformer-based architectures have become de facto models used for a r...
It has been observed that residual networks can be viewed as the explici...
In stochastic optimization, large batch training can leverage parallel r...
Residual neural networks can be viewed as the forward Euler discretizati...
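For reference, the correspondence underlying this view (standard, not specific to the paper above): a residual block is one forward Euler step of an ODE taken with unit step size,

\[
x_{k+1} \;=\; x_k + f(x_k, \theta_k) \quad\Longleftrightarrow\quad \dot{x}(t) = f\bigl(x(t), \theta(t)\bigr), \qquad x_{k+1} = x_k + h\, f(x_k, \theta_k),\; h = 1 .
\]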
Deep Neural Networks are quite vulnerable to adversarial perturbations. ...
Optimal parameter initialization remains a crucial problem for neural ne...
Increasing the mini-batch size for stochastic gradient descent offers si...
Gliomas are the most common primary brain malignancies, with different d...
We propose a segmentation framework that uses deep neural networks and i...
Stochastic Gradient Descent (SGD) methods using randomly selected batche...
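For context, the standard mini-batch SGD update on a loss \(L(w)=\frac{1}{N}\sum_i \ell_i(w)\), with a randomly drawn batch \(B_t\) and step size \(\eta_t\), is

\[
w_{t+1} \;=\; w_t \;-\; \eta_t\,\frac{1}{|B_t|}\sum_{i\in B_t}\nabla \ell_i(w_t).
\]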
We introduce CLAIRE, a distributed-memory algorithm and software for sol...
Deep Learning is arguably the most rapidly evolving research area in rec...
One of the main barriers for deploying neural networks on embedded syste...
PDE-constrained optimization problems find many applications in medical ...
Large batch size training of Neural Networks has been shown to incur acc...
We propose a new integrated method of exploiting model, batch and domain...
We propose a new integrated method of exploiting both model and data par...
We present a parallel distributed-memory algorithm for large deformation...