Optimizing Parallel Reduction in CUDA - Nvidia