Block-interleaved GPU matrix solver (#208)
Fixes #185.

Add a new back end GPU Hines matrix solver that uses a block-interleaved storage pattern to improve memory coalescing during the matrix solve.

* Refactor the `src/backends` path into `src/backends/gpu` and `src/backends/multicore` paths that contain the `gpu` and `multicore` implementations.
* Refactor the matrix state and threshold detection members that were declared inline in the back end specifications into separate files.
* Add a new interleaved matrix state back end.
* Refactor all of the GPU kernels that were originally in the one back end header file into their own header files.
* Write more comprehensive unit tests for the GPU matrix solver back end that test the `interleave` and `reverse_interleave` operations in isolation, and ensure that the flat and interleaved back ends produce identical results.
* Add the GPU versions of the kinetic scheme validation tests.
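To illustrate the idea behind the block-interleaved layout, here is a minimal host-side C++ sketch of an `interleave` / `reverse_interleave` round trip. It is not the code from this change: the `interleaved_layout` struct, the `interleaved_index` helper, the function signatures, and the choice of zero padding are all assumptions made for illustration. The sketch assumes matrices are grouped into blocks of `block_width` lanes, with each matrix padded to `padded_size` rows, so that row `i` of neighbouring matrices sits in neighbouring memory locations, which is the access pattern that would let adjacent GPU threads coalesce their loads.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical layout description (names are illustrative, not from the commit).
struct interleaved_layout {
    std::size_t block_width;  // number of matrices stored side by side in one block
    std::size_t padded_size;  // rows reserved for every matrix (>= largest matrix)
};

// Position of row i of matrix m in the interleaved array.
inline std::size_t interleaved_index(const interleaved_layout& layout,
                                     std::size_t m, std::size_t i) {
    const std::size_t block = m / layout.block_width;
    const std::size_t lane  = m % layout.block_width;
    return block * layout.block_width * layout.padded_size
         + i * layout.block_width
         + lane;
}

// Flat storage (matrices back to back) -> block-interleaved storage.
// Unused slots are filled with pad_value.
std::vector<double> interleave(const std::vector<double>& flat,
                               const std::vector<std::size_t>& sizes,
                               const interleaved_layout& layout,
                               double pad_value = 0.0) {
    const std::size_t num_blocks =
        (sizes.size() + layout.block_width - 1) / layout.block_width;
    std::vector<double> out(
        num_blocks * layout.block_width * layout.padded_size, pad_value);

    std::size_t start = 0;  // offset of matrix m in the flat array
    for (std::size_t m = 0; m < sizes.size(); ++m) {
        assert(sizes[m] <= layout.padded_size);
        for (std::size_t i = 0; i < sizes[m]; ++i) {
            out[interleaved_index(layout, m, i)] = flat[start + i];
        }
        start += sizes[m];
    }
    assert(start == flat.size());
    return out;
}

// Block-interleaved storage -> flat storage, dropping the padding.
std::vector<double> reverse_interleave(const std::vector<double>& interleaved,
                                       const std::vector<std::size_t>& sizes,
                                       const interleaved_layout& layout) {
    std::vector<double> flat;
    for (std::size_t m = 0; m < sizes.size(); ++m) {
        for (std::size_t i = 0; i < sizes[m]; ++i) {
            flat.push_back(interleaved[interleaved_index(layout, m, i)]);
        }
    }
    return flat;
}
```

With three matrices of sizes 3, 2 and 1, a block width of 2 and a padded size of 3, the first block would hold the rows of matrices 0 and 1 alternately, and the third matrix would start a second block. A round trip through both functions should return the original flat array, which is the kind of property the isolated unit tests mentioned above would check.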
Changed files:
- modcc/cudaprinter.cpp: 1 addition, 1 deletion
- src/CMakeLists.txt: 2 additions, 2 deletions
- src/backends/fvm.hpp: 2 additions, 2 deletions
- src/backends/fvm_gpu.hpp: 0 additions, 458 deletions
- src/backends/fvm_multicore.hpp: 0 additions, 286 deletions
- src/backends/gpu/fvm.cu: 6 additions, 2 deletions
- src/backends/gpu/fvm.hpp: 89 additions, 0 deletions
- src/backends/gpu/intrinsics.hpp: 0 additions, 0 deletions
- src/backends/gpu/kernels/assemble_matrix.hpp: 104 additions, 0 deletions
- src/backends/gpu/kernels/detail.hpp: 82 additions, 0 deletions
- src/backends/gpu/kernels/interleave.hpp: 134 additions, 0 deletions
- src/backends/gpu/kernels/solve_matrix.hpp: 78 additions, 0 deletions
- src/backends/gpu/kernels/test_thresholds.hpp: 61 additions, 0 deletions
- src/backends/gpu/matrix_state_flat.hpp: 118 additions, 0 deletions
- src/backends/gpu/matrix_state_interleaved.hpp: 256 additions, 0 deletions
- src/backends/gpu/stack.hpp: 3 additions, 3 deletions
- src/backends/gpu/stimulus.hpp: 3 additions, 13 deletions
- src/backends/gpu/threshold_watcher.hpp: 148 additions, 0 deletions
- src/backends/matrix_storage.md: 108 additions, 0 deletions
- src/backends/multicore/fvm.cpp: 1 addition, 1 deletion