The increasing availability of machines relying on non-GPU architectures...
Ranging from NVIDIA GPUs to AMD GPUs and Intel GPUs: Given the heterogen...
Parallel programming remains a daunting challenge, from the struggle to...
Meeting both scalability and performance portability requirements is a c...
Octo-Tiger, a large-scale 3D AMR code for the merger of stars, uses a co...
Understanding the behavior of software in execution is a key step in ide...
Asynchronous Many-Task (AMT) runtime systems take advantage of multi-cor...
On the way to Exascale, programmers face the increasing challenge of hav...
Octo-Tiger is a code for modeling three-dimensional self-gravitating ast...
Scientific applications that run on leadership computing facilities ofte...
Analyzing performance within asynchronous many-task-based runtime system...
Exceptions and errors occurring within mission critical applications due...
This paper describes how we successfully used the HPX programming model ...
Arm technology is becoming increasingly important in HPC. Recently, Fuga...
Although recent scaling up approaches to train deep neural networks have...
Exceptions and errors occurring within mission critical applications due...
OpenMP has been the de facto standard for single node parallelism for mo...
We study the simulation of stellar mergers, which requires complex simul...
Asynchronous Many-Task (AMT) runtime systems have gained increasing acce...
Experience shows that on today's high performance systems the utilizatio...
Despite advancements in the areas of parallel and distributed computing,...
Peridynamics is a non-local generalization of continuum mechanics tailor...
The performance of many parallel applications depends on loop-level para...