__host__ __device__ – Generic programming in Cuda
We present patterns for Cuda/C++ to write safe generic code that works on both the host and the device side. Writing templated functions in Cuda/C++ for both the CPU and the GPU has the problem that, in general, both the __host__ and the __device__ versions of a function are instantiated, which leads to many compiler warnings or errors.
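The problem can be illustrated with a minimal sketch. It is not taken from the paper: the names `HOST_DEVICE`, `Square`, `HostOnlyLog`, `apply_twice` and `run_example` are hypothetical and chosen only for illustration. The macro expands to `__host__ __device__` under nvcc and to nothing otherwise, so the same file also compiles as plain C++.

```cpp
#include <cassert>
#include <vector>

// Hypothetical portability macro (an assumption of this sketch):
// CUDA qualifiers under nvcc, nothing for a plain C++ compiler.
#ifdef __CUDACC__
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Functor that is callable on both the host and the device.
struct Square {
    HOST_DEVICE int operator()(int x) const { return x * x; }
};

// Functor without qualifiers: under nvcc this is a __host__-only function.
// It touches std::vector, which is not usable in device code.
struct HostOnlyLog {
    std::vector<int>* log;
    int operator()(int x) const {
        log->push_back(x);
        return x + 1;
    }
};

// Generic function marked for both sides. Every instantiation is compiled
// as a __host__ and as a __device__ function, whatever F turns out to be.
template <typename F>
HOST_DEVICE int apply_twice(F f, int x) {
    return f(f(x));
}

// Host-side usage of both instantiations.
inline int run_example() {
    std::vector<int> log;
    int a = apply_twice(Square{}, 3);           // Square is fine on both sides
    int b = apply_twice(HostOnlyLog{&log}, 2);  // host-only functor in a
                                                // __host__ __device__ template
    assert(log.size() == 2);
    return a + b;
}
```

Compiled as ordinary C++ this is unremarkable. Compiled with nvcc, the second instantiation is where the trouble starts: `apply_twice<HostOnlyLog>` is also generated as a `__device__` function, and nvcc typically diagnoses the call to a `__host__` function from a `__host__ __device__` function, even though the code only ever runs on the host.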