With the emergence of large foundational models, model-serving systems a...
A growing number of applications depend on Machine Learning (ML) functio...
With the increase in the scale of Deep Learning (DL) training workloads ...
Federated Learning (FL) is a well-established technique for privacy pres...
Efficient inference of Deep Neural Networks (DNNs) is essential to makin...
Machine Learning (ML) research has focused on maximizing the accuracy of...
This paper introduces a novel approach to automatic ahead-of-time (AOT) ...
The emergence of CNNs in mainstream deployment has necessitated methods ...
Deep learning models have achieved expert-level performance in healthcar...
Function-as-a-Service (FaaS) platforms and "serverless" cloud computing ...
Prior research in resource scheduling for machine learning training work...
Current trends in Machine Learning (ML) inference on hardware accelerate...
Serving deep neural networks in latency-critical interactive settings of...
Serverless computing offers the potential to program the cloud in an aut...
The dominant cost in production machine learning workloads is not traini...
The next generation of AI applications will continuously interact with t...
Advances in deep learning have led to substantial increases in predictio...