Simd partition by constraint (#494)
Changes have been made to the SIMD implementation of mechanism functions:

- The node_index array (the array of indices that specifies, for each mechanism, the CVs where it is present) is now partitioned into 4 arrays according to the constraint on each simd_vector in node_index:
  1. contiguous array: contains the indices of all simd_vectors in node_index whose elements are contiguous.
  2. constant array: contains the indices of all simd_vectors in node_index whose elements are identical.
  3. independent array: contains the indices of all simd_vectors in node_index whose elements are independent (no repetitions) but not contiguous.
  4. none array: contains the indices of all simd_vectors in node_index where none of the above constraints apply.

  When mechanism functions are executed, they loop over each of the 4 arrays separately. This allows for optimizations in every category.
- The modcc compiler was modified to generate code for the previous changes, including the optimizations per constraint:
  1. contiguous array: we use vector load/store and vector arithmetic.
  2. constant array: we load only one element and broadcast it into a simd_vector; we use vector arithmetic; we reduce the result; we store one element.
  3. independent array: we use vector scatter/gather and vector arithmetic.
  4. none array: we cannot operate on the simd_vector in parallel, so we loop over the elements to read, perform arithmetic and write back.
- Added a mechanism benchmark for pas, hh and expsyn.
- Moved/modified some functions in simd.hpp to ensure that the correct implementation of a function is being called.
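The partitioning and the per-constraint update strategies described above can be illustrated with a minimal scalar sketch. It is not the code from partition_by_constraint.hpp or the code generated by modcc: the names `index_constraint`, `constraint_partition`, `classify`, `partition_by_constraint` and `add_contributions` are hypothetical, each stored index is assumed to be the offset of the first element of a simd_vector within node_index, and node_index is assumed to be padded to a multiple of the SIMD width. Plain loops stand in for the vector load/store, broadcast/reduce and scatter/gather operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names for illustration; the real types in the commit may differ.
enum class index_constraint { contiguous, constant, independent, none };

struct constraint_partition {
    // Each entry is the offset into node_index where one simd_vector starts.
    std::vector<std::size_t> contiguous, constant, independent, none;
};

// Classify the `width` indices starting at node_index[start].
inline index_constraint classify(const std::vector<int>& node_index,
                                 std::size_t start, std::size_t width) {
    bool contiguous = true, constant = true, independent = true;
    for (std::size_t i = start + 1; i < start + width; ++i) {
        if (node_index[i] != node_index[i - 1] + 1) contiguous = false;
        if (node_index[i] != node_index[start]) constant = false;
        // Any repeated value within the block rules out "independent".
        for (std::size_t j = start; j < i; ++j) {
            if (node_index[i] == node_index[j]) independent = false;
        }
    }
    if (contiguous) return index_constraint::contiguous;
    if (constant) return index_constraint::constant;
    if (independent) return index_constraint::independent;
    return index_constraint::none;
}

// Assumption: node_index.size() is a multiple of width; a trailing partial
// block, if any, is ignored by this sketch.
inline constraint_partition partition_by_constraint(const std::vector<int>& node_index,
                                                    std::size_t width) {
    assert(width > 0);
    constraint_partition p;
    for (std::size_t start = 0; start + width <= node_index.size(); start += width) {
        switch (classify(node_index, start, width)) {
        case index_constraint::contiguous:  p.contiguous.push_back(start);  break;
        case index_constraint::constant:    p.constant.push_back(start);    break;
        case index_constraint::independent: p.independent.push_back(start); break;
        case index_constraint::none:        p.none.push_back(start);        break;
        }
    }
    return p;
}

// Scalar stand-in for a mechanism update: target[node_index[i]] += contrib[i],
// executed one constraint category at a time.
inline void add_contributions(std::vector<double>& target,
                              const std::vector<int>& node_index,
                              const std::vector<double>& contrib,
                              const constraint_partition& p, std::size_t width) {
    // contiguous: one base index, consecutive lanes (vector load/store in SIMD code).
    for (std::size_t s : p.contiguous) {
        std::size_t base = static_cast<std::size_t>(node_index[s]);
        for (std::size_t l = 0; l < width; ++l) target[base + l] += contrib[s + l];
    }
    // constant: reduce all lanes, then update a single element.
    for (std::size_t s : p.constant) {
        double sum = 0;
        for (std::size_t l = 0; l < width; ++l) sum += contrib[s + l];
        target[node_index[s]] += sum;
    }
    // independent: lanes hit distinct elements (gather/scatter in SIMD code).
    for (std::size_t s : p.independent) {
        for (std::size_t l = 0; l < width; ++l) target[node_index[s + l]] += contrib[s + l];
    }
    // none: lanes may collide, so elements are processed strictly one by one.
    for (std::size_t s : p.none) {
        for (std::size_t l = 0; l < width; ++l) target[node_index[s + l]] += contrib[s + l];
    }
}
```

In this scalar form the independent and none loops look identical; the distinction only matters in real SIMD code, where a scatter with repeated indices would lose contributions and therefore has to be replaced by an element-by-element loop.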
Showing 13 changed files
- modcc/printer/cprinter.cpp 156 additions, 41 deletions
- modcc/printer/cprinter.hpp 11 additions, 0 deletions
- src/backends/multicore/mechanism.cpp 7 additions, 0 deletions
- src/backends/multicore/mechanism.hpp 2 additions, 0 deletions
- src/backends/multicore/partition_by_constraint.hpp 118 additions, 0 deletions
- src/fvm_lowered_cell_impl.hpp 5 additions, 0 deletions
- src/simd/implbase.hpp 0 additions, 39 deletions
- src/simd/simd.hpp 83 additions, 4 deletions
- tests/ubench/CMakeLists.txt 2 additions, 0 deletions
- tests/ubench/mech_vec.cpp 363 additions, 0 deletions
- tests/unit/CMakeLists.txt 1 addition, 0 deletions
- tests/unit/test_partition_by_constraint.cpp 153 additions, 0 deletions
- tests/unit/test_simd.cpp 2 additions, 0 deletions