Commit 1f188dcd authored by Ben Cumming, committed by Sam Yates

Improve reduce by key GPU performance. (#301)

Optimize the reduce by key operation that the GPU back end uses when accumulating synapse current contributions into compartment currents. This gives a significant speedup in the miniapp for cells with few compartments and many synapses.

* Implement `gpu::reduce_by_key` device function that uses warp intrinsics to perform reduction between threads in a warp before using a global atomic update to store the result.
* Add unit tests for `reduce_by_key` functionality.
* Add micro benchmarks that compare against using CUDA atomics.
* Modify `CudaPrinter` modcc class to emit `reduce_by_key` in place of `cudaAtomicAdd` functions.
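
To illustrate the idea behind the first item, here is a minimal host-side C++ sketch of what a reduce by key computes. It is not the `gpu::reduce_by_key` device function from this commit: the name `reduce_by_key_reference`, the constant `warp_width`, and the assumption that keys are sorted so that equal keys are contiguous are all hypothetical choices made for this example. Each warp-sized chunk sums runs of contributions that share a key and then performs one update per run, which stands in for the single global atomic add.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical width of a warp; 32 threads on current NVIDIA hardware.
constexpr std::size_t warp_width = 32;

// Host-side reference sketch (an assumption for illustration only).
// contrib[i] is added to out[key[i]]. Keys are assumed to be sorted, so
// contributions to the same target are contiguous.
// Returns the number of updates made to `out`; each update stands in
// for one global atomic add on the device.
inline std::size_t reduce_by_key_reference(
    const std::vector<double>& contrib,
    const std::vector<int>& key,
    std::vector<double>& out)
{
    assert(contrib.size() == key.size());

    std::size_t updates = 0;
    const std::size_t n = contrib.size();

    // Process the input one warp-sized chunk at a time.
    for (std::size_t base = 0; base < n; base += warp_width) {
        const std::size_t end = std::min(n, base + warp_width);
        std::size_t i = base;
        while (i < end) {
            // Sum the run of equal keys inside this chunk.
            const int k = key[i];
            double sum = 0;
            for (; i < end && key[i] == k; ++i) {
                sum += contrib[i];
            }
            out[k] += sum; // one update per run, not one per contribution
            ++updates;
        }
    }
    return updates;
}
```

With many synapses per compartment, most contributions in a chunk share a key, so the number of updates drops well below the number of contributions. That is the case where replacing per-contribution atomics with a reduction is expected to help.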

Some improvements to meter reporting:
* Shorten names of metering regions in miniapp to make them easier to grep.
* JSON is no longer used as an intermediate data type when gathering distributed meters into a single report; instead, conversion to JSON is performed just before writing to file.
* Add a print function for summarizing meter results t...
parent 1a58e003