Use native cuda atomicAdd on Pascal (#174)
Fixes #125

* Add `cuda_atomic_add` and `cuda_atomic_sub` wrappers for atomic addition.
* Choose native atomic add for Pascal and later architectures.
* Choose CAS workaround for devices earlier than Pascal.
* Add unit test for wrappers.
* Change default CUDA architecture target to `sm_60` in `CMakeLists.txt`.
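The dispatch described in the list above can be sketched as follows. This is a minimal illustration, not the contents of `src/backends/gpu_intrinsics.hpp`: the wrapper names come from the commit message, but their signatures and bodies, and the helpers `accumulate` and `run_atomic_sum`, are assumptions made for the example. It relies on double-precision `atomicAdd` being available natively on compute capability 6.0 (Pascal) and later, and falls back to the usual `atomicCAS` loop otherwise.

```cuda
#include <cuda_runtime.h>

// Assumed signature: atomically add val to *address, return the old value.
__device__ inline double cuda_atomic_add(double* address, double val) {
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600
    // Pascal (sm_60) and later: native double-precision atomicAdd.
    return atomicAdd(address, val);
#else
    // Earlier devices: emulate with a compare-and-swap loop on the
    // 64-bit integer representation of the double.
    unsigned long long int* address_as_ull = (unsigned long long int*)address;
    unsigned long long int old = *address_as_ull, assumed;
    do {
        assumed = old;
        old = atomicCAS(address_as_ull, assumed,
                        __double_as_longlong(val + __longlong_as_double(assumed)));
    } while (assumed != old);  // retry if another thread changed the value
    return __longlong_as_double(old);
#endif
}

// Assumed implementation: subtraction as addition of the negated value.
__device__ inline double cuda_atomic_sub(double* address, double val) {
    return cuda_atomic_add(address, -val);
}

// Hypothetical test kernel: each of n threads adds 1.0 to a single accumulator.
__global__ void accumulate(double* sum, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) cuda_atomic_add(sum, 1.0);
}

// Hypothetical host helper: launches the kernel and returns the accumulated sum.
double run_atomic_sum(int n) {
    double* d_sum = nullptr;
    cudaMalloc(&d_sum, sizeof(double));
    cudaMemset(d_sum, 0, sizeof(double));
    accumulate<<<(n + 127) / 128, 128>>>(d_sum, n);
    double h_sum = 0.0;
    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_sum, d_sum, sizeof(double), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);
    return h_sum;
}
```

Compiled with, for example, `nvcc -arch=sm_60`, the native branch is selected; targeting an older architecture selects the CAS loop. In either case `run_atomic_sum(n)` should return `n` if no updates are lost.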
Showing
- CMakeLists.txt: 1 addition, 2 deletions
- modcc/cudaprinter.cpp: 2 additions, 16 deletions
- src/backends/gpu_intrinsics.hpp: 41 additions, 0 deletions
- tests/unit/CMakeLists.txt: 1 addition, 0 deletions
- tests/unit/test_atomics.cu: 53 additions, 0 deletions