Squashed merge for fine matrix solver (#640)
Add a new Hines matrix solver implementation for the GPU that can solve a single tree in parallel with multiple threads. It replaces the interleaved solver, which used a single thread to solve each matrix. Branches that share a common root in the tree can be solved independently of one another in both the forward and the backward solution passes.

* Add a matrix storage type, `arb::gpu::matrix_state_fine`, that stores the branches of multiple trees for efficient backward and forward substitution.
* Extend the `arb::tree` data structure to support choosing a new root node, and determining the root node that minimises the maximum distance between the root and any of the tree's leaves.
* Implement code for rebalancing a set of matrix trees, a.k.a. a "forest" of trees.
* Add CUDA kernels for efficiently performing the matrix assembly and matrix solution steps.
* Add CMake option `ARB_WITH_GPU_FINE_MATRIX` for toggling the new solver (default `on`).
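
For reviewers unfamiliar with the Hines scheme, the two passes mentioned above can be sketched sequentially on the CPU. This is only an illustrative reference under stated assumptions, not the CUDA kernels added in this change: `hines_system`, `hines_apply` and `hines_solve` are hypothetical names, the matrix is assumed symmetric, and nodes are assumed to be numbered so that every parent index is smaller than its child's index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for one tree-structured (Hines) system; not an Arbor type.
struct hines_system {
    std::vector<std::size_t> parent; // parent[i] < i for i > 0; parent[0] == 0 (root)
    std::vector<double> d;           // diagonal entries
    std::vector<double> u;           // u[i] couples node i with parent[i]; u[0] is unused
};

// Compute y = A*x for the symmetric tree matrix described by m.
std::vector<double> hines_apply(const hines_system& m, const std::vector<double>& x) {
    const std::size_t n = m.d.size();
    std::vector<double> y(n);
    for (std::size_t i = 0; i < n; ++i) {
        y[i] = m.d[i] * x[i];
    }
    for (std::size_t i = 1; i < n; ++i) {
        const std::size_t p = m.parent[i];
        y[i] += m.u[i] * x[p];
        y[p] += m.u[i] * x[i];
    }
    return y;
}

// Solve A*x = rhs. Both arguments are taken by value because they are modified.
std::vector<double> hines_solve(hines_system m, std::vector<double> rhs) {
    const std::size_t n = m.d.size();
    assert(m.parent.size() == n && m.u.size() == n && rhs.size() == n);
    if (n == 0) return rhs;

    // Backward pass (leaves towards root): fold each node into its parent.
    // Sibling branches only write to their own rows until they reach the
    // shared parent, which is what allows them to be processed independently.
    for (std::size_t i = n - 1; i > 0; --i) {
        assert(m.parent[i] < i);
        const std::size_t p = m.parent[i];
        const double factor = m.u[i] / m.d[i];
        m.d[p] -= factor * m.u[i];
        rhs[p] -= factor * rhs[i];
    }

    // Forward pass (root towards leaves): substitute the parent's solution.
    rhs[0] /= m.d[0];
    for (std::size_t i = 1; i < n; ++i) {
        rhs[i] = (rhs[i] - m.u[i] * rhs[m.parent[i]]) / m.d[i];
    }
    return rhs;
}
```

A quick way to check the sketch is to pick a known solution, form the right-hand side with `hines_apply`, and confirm that `hines_solve` recovers it. In a parallel version the updates that sibling branches make to a shared parent would need to be combined safely; that detail is not shown here.
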
Changed files:
- CMakeLists.txt: 4 additions, 0 deletions
- arbor/CMakeLists.txt: 4 additions, 0 deletions
- arbor/algorithms.hpp: 23 additions, 8 deletions
- arbor/backends/gpu/forest.cpp: 181 additions, 0 deletions
- arbor/backends/gpu/forest.hpp: 140 additions, 0 deletions
- arbor/backends/gpu/fvm.hpp: 11 additions, 1 deletion
- arbor/backends/gpu/matrix_fine.cpp: 71 additions, 0 deletions
- arbor/backends/gpu/matrix_fine.cu: 314 additions, 0 deletions
- arbor/backends/gpu/matrix_fine.hpp: 77 additions, 0 deletions
- arbor/backends/gpu/matrix_solve.cu: 1 addition, 0 deletions
- arbor/backends/gpu/matrix_state_fine.hpp: 494 additions, 0 deletions
- arbor/tree.cpp: 385 additions, 0 deletions
- arbor/tree.hpp: 61 additions, 92 deletions
- test/unit/test_gpu_stack.cu: 3 additions, 0 deletions
- test/unit/test_matrix.cu: 15 additions, 75 deletions
- test/unit/test_matrix_cpuvsgpu.cpp: 6 additions, 3 deletions
- test/unit/test_tree.cpp: 83 additions, 0 deletions