GPU-based Data-parallel Rendering of Large, Unstructured, and Non-convexly Partitioned Data
Computational fluid dynamics simulations often produce large clusters of finite elements with non-trivial, non-convex boundaries and uneven distributions among compute nodes, which complicates compositing during interactive volume rendering. Correct, in-place visualization of such clusters is difficult because viewing rays straddle domain boundaries across multiple compute nodes. We propose a GPU-based, scalable, memory-efficient direct volume visualization framework suitable for both in situ and post hoc use. Our approach reduces the memory footprint of the unstructured volume elements with an exclusive-or (XOR)-based index reduction scheme and provides fast ray-marching-based traversal without requiring large external data structures built over the elements themselves. Moreover, we present a GPU-optimized deep compositing scheme that composites the intermediate color values accumulated on different ranks in the correct order, even for non-convex clusters. Our method scales well on large data-parallel systems and achieves interactive frame rates during visualization. We can interactively render both the Fun3D Small Mars Lander (14 GB / 798.4 million finite elements) and Huge Mars Lander (111.57 GB / 6.4 billion finite elements) data sets at 14 and 10 frames per second using 72 and 80 GPUs, respectively, on TACC's Frontera supercomputer.
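To illustrate why non-convex partitions need deep compositing, the following is a minimal CPU-side sketch in Python and not the framework's GPU implementation. It assumes that each rank emits, per pixel, a list of fragments, one per contiguous ray segment inside that rank's elements. The fragment layout `(entry_depth, color, alpha)` and the function name `composite_pixel` are hypothetical and chosen only for this example.

```python
from typing import List, Tuple

# Hypothetical fragment layout: (entry_depth, (r, g, b), alpha),
# with non-premultiplied color. One fragment per ray segment per rank.
Fragment = Tuple[float, Tuple[float, float, float], float]


def composite_pixel(per_rank_fragments: List[List[Fragment]]):
    """Merge one pixel's fragments from all ranks and blend front to back."""
    # Gather every fragment, regardless of which rank produced it.
    merged = [f for frags in per_rank_fragments for f in frags]
    # Sort by entry depth so that interleaved segments from non-convex
    # partitions end up in the correct visibility order.
    merged.sort(key=lambda f: f[0])

    r = g = b = 0.0
    alpha = 0.0
    for _, (fr, fg, fb), fa in merged:
        weight = (1.0 - alpha) * fa  # front-to-back "over" operator
        r += weight * fr
        g += weight * fg
        b += weight * fb
        alpha += weight
        if alpha >= 0.999:  # early termination once nearly opaque
            break
    return (r, g, b), alpha


# Rank 0 owns two segments (depths 1.0 and 3.0) that enclose rank 1's
# segment (depth 2.0), as can happen with a non-convex partition.
rank0 = [(1.0, (1.0, 0.0, 0.0), 0.5), (3.0, (0.0, 0.0, 1.0), 0.5)]
rank1 = [(2.0, (0.0, 1.0, 0.0), 0.5)]
color, alpha = composite_pixel([rank0, rank1])
print(color, alpha)
```

In this example, a ray enters rank 0's region, crosses rank 1's region, and re-enters rank 0's region. Collapsing each rank to a single color before compositing would blend rank 0's far segment ahead of rank 1's segment, which is why the sketch keeps one fragment per segment and sorts them by depth across ranks.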