A fine-grained parallelization of the immersed boundary method
We present new algorithms for the parallelization of Eulerian-Lagrangian interaction operations in the immersed boundary method. Our algorithms rely on two well-studied parallel primitives: key-value sort and segmented reduce. The use of these parallel primitives allows us to implement our algorithms on graphics processing units (GPUs) as well as on other shared-memory architectures. We present strong and weak scaling tests on problems involving scattered points and elastic structures. Our tests show that our algorithms exhibit near-ideal scaling on both multicore CPUs and GPUs.
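
As a rough illustration of how a key-value sort and a segmented reduce can combine in a Lagrangian-to-Eulerian spreading step, the sketch below spreads point forces onto a grid. It is not the authors' algorithm: the 1D periodic grid, the two-point hat kernel, and the names `spread_forces`, `positions`, `forces`, `n_cells`, and `h` are all assumptions made for illustration, and sequential Python stands in for the parallel primitives.

```python
import math
from itertools import groupby
from operator import itemgetter


def spread_forces(positions, forces, n_cells, h=1.0):
    """Spread point forces onto a periodic 1D grid (hypothetical setup)."""
    # Step 1: each point emits (grid index, contribution) pairs.
    # Points are independent, so this loop could run in parallel.
    pairs = []
    for x, f in zip(positions, forces):
        s = x / h
        left = math.floor(s)      # grid node at or to the left of the point
        w_right = s - left        # hat-kernel weight of the right-hand node
        pairs.append((left % n_cells, (1.0 - w_right) * f / h))
        pairs.append(((left + 1) % n_cells, w_right * f / h))

    # Step 2: key-value sort, so contributions to the same grid node
    # become contiguous.
    pairs.sort(key=itemgetter(0))

    # Step 3: segmented reduce, summing each run of equal keys.
    # Each grid node is written once, so no atomic updates are needed.
    grid = [0.0] * n_cells
    for key, segment in groupby(pairs, key=itemgetter(0)):
        grid[key] = sum(value for _, value in segment)
    return grid


if __name__ == "__main__":
    print(spread_forces([0.5, 0.5, 2.0], [1.0, 1.0, 4.0], n_cells=4))
```

Sorting first makes each grid node's contributions contiguous, which is what allows the reduction to proceed segment by segment without write conflicts. On a GPU, steps 2 and 3 would presumably map to library sort-by-key and reduce-by-key routines.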