Commit c23c694a authored by noraabiakar, committed by Sam Yates

Optimize vectorized compound_indexed_add (#673)

* Optimize the "none" index_constraint specialization of compound_indexed_add so that, when equal indices are adjacent within a vector, each distinct memory index is read and written only once per vector.

Related to issue #637.
parent 5574e05f
@@ -220,9 +220,15 @@ namespace simd_detail {
         scalar_type a[width];
         Impl::copy_to(s, a);
-        for (unsigned i = 0; i<width; ++i) {
-            p[o[i]] += a[i];
+        scalar_type temp = 0;
+        for (unsigned i = 0; i<width-1; ++i) {
+            temp += a[i];
+            if (o[i] != o[i+1]) {
+                p[o[i]] += temp;
+                temp = 0;
+            }
         }
+        p[o[width-1]] += temp + a[width-1];  // flush the final run, including the last lane
     }
     break;
 case index_constraint::independent:
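The run-merging idea in this hunk can be tried outside the SIMD wrapper. The following is a minimal sketch under stated assumptions, not the library's code: the helper name `run_merged_indexed_add` and its array-reference signature are hypothetical, standing in for the `p`, `o`, and `a` variables of the hunk. Lanes with equal consecutive offsets are summed into a temporary, and each run costs a single read-modify-write on `p`; the final lane is folded in after the loop.

```cpp
#include <cstddef>

// Hypothetical stand-alone helper (not part of the library): adds each lane
// value a[i] to p[o[i]], merging runs of equal consecutive offsets so that
// each run touches memory once.
template <typename T, typename I, std::size_t Width>
void run_merged_indexed_add(T* p, const I (&o)[Width], const T (&a)[Width]) {
    static_assert(Width > 0, "need at least one lane");

    T temp = 0;
    for (std::size_t i = 0; i + 1 < Width; ++i) {
        temp += a[i];
        if (o[i] != o[i + 1]) {
            // The run of equal offsets ends at lane i: one read-modify-write.
            p[o[i]] += temp;
            temp = 0;
        }
    }
    // The last lane always closes a run: add it to the pending sum and
    // accumulate into memory rather than overwrite.
    p[o[Width - 1]] += temp + a[Width - 1];
}
```

Duplicate offsets that are not adjacent are still handled correctly, since every run is accumulated with `+=`; they simply cost one memory access per run rather than one per distinct offset.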