GPU-friendly, Parallel, and (Almost-)In-Place Construction of Left-Balanced k-d Trees
We present an algorithm for building left-balanced, complete k-d trees over k-dimensional points in a trivially parallel, GPU-friendly way. The algorithm requires exactly one int per data point as temporary storage and runs in O(log N) iterations, each of which performs one parallel sort and one trivially parallel per-node CUDA update kernel.
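To make the iteration structure concrete, the following is a minimal sequential sketch in Python of one way such a scheme can work. It is not the authors' CUDA implementation. It assumes that the one int per point is a tag naming the subtree the point currently belongs to, that the tree is stored in level order (children of node i at 2i+1 and 2i+2), and that the split dimension cycles with depth. The function names and the loop-based computation of segment offsets are hypothetical choices made for illustration.

```python
def subtree_size(s, n):
    """Number of nodes in the subtree rooted at node s of a complete,
    level-order binary tree with n nodes (children of i: 2i+1, 2i+2)."""
    size, first, width = 0, s, 1
    while first < n:
        size += min(width, n - first)
        first = 2 * first + 1
        width *= 2
    return size


def build_left_balanced_kdtree(points):
    """Return the points reordered into a level-order, left-balanced k-d tree.

    Illustrative sketch: one integer tag per point, one sort plus one
    per-point update for each tree level.
    """
    n = len(points)
    if n == 0:
        return []
    k = len(points[0])
    pts = list(points)
    tags = [0] * n  # the single int of temporary storage per point

    level = 0
    while (1 << level) - 1 < n:
        dim = level % k  # assumption: round-robin split dimension
        # Step 1: sort by (subtree tag, coordinate in the split dimension).
        # On a GPU this would be a single parallel key-value sort.
        pairs = sorted(zip(tags, pts), key=lambda tp: (tp[0], tp[1][dim]))
        tags = [t for t, _ in pairs]
        pts = [p for _, p in pairs]

        # Nodes on shallower levels are settled and occupy slots 0..first-1.
        first = (1 << level) - 1
        # Start offset of each subtree's segment on this level.
        begin, pos = {}, first
        for s in range(first, min(2 * first + 1, n)):
            begin[s] = pos
            pos += subtree_size(s, n)

        # Step 2: per-point update; every index is independent, so on a GPU
        # this loop would be one thread per point.
        for i in range(first, n):
            s = tags[i]
            pivot = begin[s] + subtree_size(2 * s + 1, n)
            if i < pivot:
                tags[i] = 2 * s + 1  # point moves into the left subtree
            elif i > pivot:
                tags[i] = 2 * s + 2  # point moves into the right subtree
            # i == pivot: the point becomes node s and keeps its tag
        level += 1

    # After the sort of the last level, every tag equals its array index,
    # so pts is already in level order.
    return pts
```

In this sketch the loop runs once per tree level, that is floor(log2 N) + 1 times, which matches the O(log N) iteration count stated above. Python's `sorted` allocates new lists, so the sketch illustrates only the tagging logic and not the near-in-place memory behaviour.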